My first ever public pet project, Lyrics Bot, gave me lots of opportunities to experiment, especially with database selection. I had a chance to explore many different options and to reach their limits, so now I’d like to share my experience with them.
Over the course of this project I faced numerous challenges that tested my problem-solving skills. Whether it was optimizing the database for faster retrieval or ensuring data consistency, each challenge taught me its own lessons. Reflecting on these hurdles enriched my understanding of database management and deepened my appreciation of the intricate balance between efficiency and reliability in technology.
In the first version of the project I stored everything in RAM to build a POC (Proof Of Concept). I knew this approach wouldn’t be sustainable in the long term and couldn’t go to production as it was, so I began exploring data persistence options, starting with the most obvious choice.
Very big JSON file
I chose to keep my RAM state as it was, without altering any code. Instead, I saved everything to a JSON file using Node’s standard fs (file system) package.
I used synchronous file-writing functions. Every time I updated the global database object, I wrote the changes to the file. I hit my first bottleneck when I reached 1,000 users. Due to the frequency and volume of data changes my VPS (Virtual Private Server) couldn’t handle it. This prompted my decision to switch the storage for a second time.
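A minimal sketch of what this setup might look like. The file path, the shape of the `db` object and the `setUser` helper are all assumptions made for illustration, not the bot's actual code:

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical file location; the real path is not given in this post.
const DB_FILE = path.join(os.tmpdir(), "lyrics-bot-db.json");

// The whole "database" is a single in-memory object (assumed shape).
const db = { users: {} };

// Serialize the entire object and block until the write has finished.
function saveDb() {
  fs.writeFileSync(DB_FILE, JSON.stringify(db));
}

// Hypothetical update helper: every change rewrites the full file,
// so the cost of one update grows with the total amount of stored data.
function setUser(id, data) {
  db.users[id] = data;
  saveDb();
}

setUser("42", { language: "en" });
```

Because `fs.writeFileSync` blocks the event loop, nothing else can be processed while the file is being written, which is one plausible reason this approach stops scaling as the data and the update rate grow.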
Multiple JSON files
I switched to multiple JSON files to manage the volume more effectively, and it worked well. I also made a smart move: I adopted the Repository pattern for data management. I did this because I wanted to avoid rewriting the business logic every time I changed the data storage method. These days I try to apply this pattern everywhere.
In simple terms, to use this pattern you should:
- Define a repository class/object. This class should offer an interface to the code that manipulates the data.
- Modify and access data solely through the repository. Avoid any direct interaction with storage in your business logic, because every direct call would have to be rewritten if you ever switch storage. If you need a specific data manipulation, add it to the Repository class.
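The two rules above can be sketched roughly as follows. The class names, the `get`/`save` interface and the one-file-per-user layout are hypothetical and chosen only to illustrate the idea:

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical repository that keeps one JSON file per user.
class FileUserRepository {
  constructor(dir) {
    this.dir = dir;
    fs.mkdirSync(dir, { recursive: true });
  }

  // Assumes ids are safe to use as file names (e.g. numeric user ids).
  filePath(id) {
    return path.join(this.dir, `${id}.json`);
  }

  get(id) {
    const file = this.filePath(id);
    if (!fs.existsSync(file)) return null;
    return JSON.parse(fs.readFileSync(file, "utf8"));
  }

  save(id, data) {
    fs.writeFileSync(this.filePath(id), JSON.stringify(data));
  }
}

// A second implementation with the same interface, backed by a Map.
class MemoryUserRepository {
  constructor() {
    this.users = new Map();
  }

  get(id) {
    return this.users.has(id) ? this.users.get(id) : null;
  }

  save(id, data) {
    this.users.set(id, data);
  }
}

// Business logic only knows about get() and save(), never about storage.
function setLanguage(repo, userId, language) {
  const user = repo.get(userId) ?? {};
  repo.save(userId, { ...user, language });
}

const fileRepo = new FileUserRepository(
  fs.mkdtempSync(path.join(os.tmpdir(), "lyrics-bot-"))
);
const memoryRepo = new MemoryUserRepository();

setLanguage(fileRepo, "42", "uk");
setLanguage(memoryRepo, "42", "uk");
```

`setLanguage` works unchanged with either repository, so switching storage means writing one new class rather than touching the code that uses it.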
After adopting multiple JSON files and the Repository pattern I encountered my next challenge when reaching 20,000 users: the write/read operations began delaying responses.
NeDB
NeDB is akin to SQLite, but designed for the JSON ecosystem. It offers an interface similar to the one MongoDB provides for managing data collections. Since I had already implemented the Repository pattern, the repository was the only part of the code I needed to change. The smooth transition confirmed that I had made the right choice in terms of code structure. After that, my bot ran smoothly for years until I hit the next bottleneck.
At approximately 120,000 users I stumbled upon one of NeDB’s shortcomings: reading large files. The system attempted to read the entire file in one go and hit Node.js’s MAX_STRING_LENGTH limit. This error was a first for me and took me by surprise. However, thanks to the backup mechanism I had in place, my data remained intact. After that I moved to MongoDB.
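To make the limit concrete: Node.js exposes it as `buffer.constants.MAX_STRING_LENGTH`, and its value depends on the Node.js version and platform. The sketch below prints it and shows one general way to avoid building a single huge string, by reading a file line by line. It assumes a newline-delimited JSON file with one record per line, and `readRecords` is a hypothetical helper. It is not how NeDB itself loads data:

```javascript
const { constants } = require("node:buffer");
const fs = require("node:fs");
const readline = require("node:readline");

// Largest string length this Node.js build can represent.
console.log(constants.MAX_STRING_LENGTH);

// Hypothetical helper: parse a newline-delimited JSON file one line at a
// time, so no single string ever has to hold the whole file.
async function readRecords(file) {
  const records = [];
  const lines = readline.createInterface({
    input: fs.createReadStream(file, { encoding: "utf8" }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  for await (const line of lines) {
    if (line.trim() !== "") records.push(JSON.parse(line));
  }
  return records;
}
```

The parsed records still have to fit in memory, so this only removes the string-length limit, not the memory limit.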
MongoDB
The transition to MongoDB was swift and straightforward, primarily thanks to the Repository pattern: I only had to update a single file. I was optimistic that my MongoDB setup would last a long time. However, the unforeseen circumstances surrounding the Russian invasion of Ukraine led to my server crashing, resulting in the loss of most of my data and projects. Fortunately, I’ve since managed to recover and relocate everything I could to DigitalOcean, so now everything is back in operation.
In retrospect, developing and maintaining my Lyrics Bot has been a story of continuous learning and adaptation. From the initial choice of storing data in RAM to moving through various database systems, each step brought its own challenges and insights. Unpredictable events, such as a server crash caused by external geopolitical factors, reinforced the importance of flexibility, foresight and robust backup mechanisms. The Repository pattern proved the value of decoupling business logic from storage, making each transition smoother and more manageable.
Through all the ups and downs this experience has profoundly shaped my perspective on technology, infrastructure management and the impermanence of digital assets. Embracing challenges, being prepared for the unexpected and continuously learning from experiences are the cornerstones of successful digital project management.