Database latency
Incident Report for Ubidots
Resolved
We have finished deploying the new cluster node and completed the database join tasks. We consider this migration complete.
Posted May 19, 2021 - 13:30 UTC
Update
We have successfully finished the repair action on our NoSQL database and have added a new bare-metal node to our cluster. We need to run a new join and repair action with this node to finish our migration. We expect minor latency during this process, which should not impact your daily Ubidots usage. We expect to finish this process next week.
Posted May 04, 2021 - 13:02 UTC
Update
We have identified a pattern where our time-series database cluster experiences a 15-minute degradation at 11:00, 12:00, 1:00, and 2:00 GMT-5 every day. During these windows, throughput drops to approximately one tenth of capacity (the SAN throughput goes from 200 MB/s to 20 MB/s), causing some dashboard widgets to display an error message. The degradation gradually fades a few minutes after each window, once the pending requests have been processed.

Regarding the SAN issues, our service provider suggests that a server reboot will likely solve them, as it would rule out OS-related causes. Unfortunately, this is not possible at this time: given the initial degradation, we spun up a new server to increase overall cluster performance, and this "cluster joining process" takes a few days, given the size of our database.

We expect the new cluster node to join successfully within a week, after which we will be able to reboot each cluster node in turn for a fresh start.
Posted Apr 29, 2021 - 12:31 UTC
Update
We are still working on joining a new node to our cluster in order to alleviate the workload on our database. Unfortunately, this action is taking longer than expected. We have detected additional SAN latency at our bare-metal infrastructure provider, and we are working with them to improve the read/write rate of our HDDs, which should improve database access speed. In the meantime, some users may experience latency when trying to access their data from the Dashboard or Variable view at certain times of the day. Synthetic variable calculation is expected to take longer than usual.
There are no issues with data ingestion.
Posted Apr 27, 2021 - 11:55 UTC
Update
We are experiencing latency in data retrieval for some requests. This is not a general issue: it arises only at certain moments during the day and may affect some users who attempt to load dashboards or variables. We are working on the deployment of new cluster instances to solve it.
Posted Apr 22, 2021 - 14:36 UTC
Update
We have deployed a repair action to our cluster. This may cause database latency for data retrieval and synthetic variable calculation. We do not expect any outage in our core systems, and the expected latency should not affect your daily Ubidots usage. This action may take the whole week to finish.
Posted Apr 20, 2021 - 12:46 UTC
Monitoring
We are experiencing latency during query execution at our database cluster. Because of this, synthetic variables that usually take 1 minute to calculate may take from 1:30 to 2:30 minutes, and the same applies to variables with longer compute times. Dashboards, variables, and the device view may also be slow to load. This delay should not be critical for our customers' applications.

We are working to solve these issues as soon as possible.
Posted Apr 12, 2021 - 22:32 UTC