As we’ve grown steadily, we now need to be conscious of the load we’re putting on Strava’s servers: Strava rate limits apps to 600 hits per 15 minutes and 30K hits per day.
Our original sync logic was very basic: we polled Strava 16 hours after your last activity started. The result has been an increasingly high number of hits to Strava that find no new activity. This isn’t going to work as we scale our user base into the tens of thousands.
It occurred to me that predicting the time until your next activity would be an interesting approach, given that we have a good history of your patterns and training sets already geared toward activity-level predictions. We also felt that a function to step down the sync frequency after a number of unsuccessful attempts would be a good addition to the system.
We’ve made the following changes this week to try to reduce the rate at which we’re hitting Strava while still syncing your activities on a timely basis:
- Added a new prediction function called Next Activity that predicts the number of hours until your next activity starts. You can see the value of this prediction on the activity page of the app website. This prediction uses the same training sets as the Gear and Activity Type predictions, with a new target variable that captures the hours until your next activity for each of the activities we’ve processed.
- Began tracking the number of attempts to find a new activity for each athlete.
- Used the number of attempts to gradually reduce (step down) the frequency at which we hit Strava for athletes with an increasing number of unsuccessful sync attempts.
The sync frequency step-down logic works as follows: we try every other hour for the first day, then once a day for a week, then once a week for a month, and finally once a month until an activity is found.
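For the curious, here is a minimal sketch of how a step-down schedule like this could be expressed. It is an illustration rather than our production code: the function name `next_sync_interval`, the attempt counts per tier (12, 7 and 4), and the 30-day "month" are all assumptions derived from the schedule described above.

```python
from datetime import timedelta

# (attempts made at this tier, wait between attempts)
# The counts are assumptions inferred from the schedule in the post.
STEP_DOWN_SCHEDULE = [
    (12, timedelta(hours=2)),  # every other hour for the first day
    (7, timedelta(days=1)),    # once a day for a week
    (4, timedelta(weeks=1)),   # once a week for roughly a month
]
MONTHLY = timedelta(days=30)   # assumed length of "once a month"


def next_sync_interval(failed_attempts: int) -> timedelta:
    """Return how long to wait before the next sync attempt.

    failed_attempts is the number of consecutive syncs that found
    no new activity for this athlete.
    """
    remaining = failed_attempts
    for count, interval in STEP_DOWN_SCHEDULE:
        if remaining < count:
            return interval
        remaining -= count
    # All tiers exhausted: keep trying monthly until an activity appears.
    return MONTHLY
```

In a sketch like this, the attempt counter would reset to zero as soon as a sync finds a new activity, so active athletes stay on the most frequent tier.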
Please note that the Next Activity prediction fairly frequently yields a negative number of hours. When this occurs, we override the next sync to 20 hours while we fine-tune the algorithm to yield the results we’re looking for. For the data scientist in you: we’re currently using a Ridge Regression algorithm, and we may move to a classifier with classes such as [Same Day, Next Day, Rest Day, Next Week] to generalize it a bit.
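The negative-prediction override can be sketched in a few lines. This is a hypothetical illustration: the names `hours_until_next_sync` and `FALLBACK_HOURS` are made up for the example, and only the 20-hour fallback comes from the description above.

```python
FALLBACK_HOURS = 20.0  # override value described in the post


def hours_until_next_sync(predicted_hours: float) -> float:
    """Use the model's prediction unless it is negative.

    A regression model has an unbounded output, so it can predict a
    negative number of hours; in that case fall back to a fixed wait.
    """
    if predicted_hours < 0:
        return FALLBACK_HOURS
    return predicted_hours
```

For example, a prediction of -3.5 hours would be replaced with 20 hours, while a prediction of 6 hours would be used as is.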
6/19 Update: We’ve added three things to the dashboard page: the time until next sync, the number of attempts, and a link that lets you ask strive.ai to sync your activities in the next batch.
Please feel free to send feedback or questions on this or any topic to firstname.lastname@example.org.
Thanks again for your continued support!