Automating a high-touch financial process in Python at Angel Studios
I had the pleasure of working on a really fun project at Angel Studios called Disbursements. It cut a month-long quarterly process into something that can be run in a few hours, and saved about $100,000 annually in wages spent on running the process.
This was a dreaded financial process in which we pay creators on a quarterly cadence to meet our contractual obligations and maintain a relationship with them. We get revenue from different platforms, we split that revenue up by creator, and then we pay the creators for the revenue they earned. Sounds simple enough.
However the reality was that the process has been plagued by problems for years. It took a data scientist a full month every quarter to process the data and generate reports. Reports were often wrong or incomplete - creators would often complain that their historical earnings went down, for example. Reports were also not transparent, so the creators would have no way to know if calculations were done accurately or not. Combining those two things together created a bad relationship with our creators.
Identifying the core problems
First I sat with the guy who was in charge of the process before me for about 7 hours, watching him do the old process. I listened to the team members who had worked with the creators the most and they highlighted all the pain points with the process. Once I had a clear picture of what was going wrong, I was able to outline a plan to redo the process. I had a few key insights which enabled the whole thing.
- The wrong grain. Granularity matters a lot in data - if you create a daily report and then people want to know hourly information, you're toast because that fine-grain detail is gone. The problem with this process is we were using the wrong data model at the wrong granularity. While previously we were mapping a piece of content directly to a creator, instead we needed to map the content to their episode, then to their contract, then to the creator. By doing so we were able to now say how much an individual episode and contract was earning, and were able to start having contracts pay off independently instead of being lumped together by creator, which solved at least 90% of the problems we had with calculating payments correctly. This involved manually remapping thousands of videos to episodes instead of creators, and splitting past earnings and debts up by contract, but it was well worth it.
- No basic tables in place. We needed a source of truth for some basics, like a list of creators with correct names and emails, a list of contracts with contract terms and starting balances, and a list of episodes, with primary and foreign key relationships between those three tables.
- The wrong tools. The process was using a Postgres db, Elixir, LulaTex for creating the report template, some Python notebooks for initial processing of some data, and google sheets for storing and processing some key data. The important question here is "what type of process is this, and what types of tools does this process require?" This is a high-touch financial process, high-touch meaning that there are lots of exceptions - every step of the process is different for every platform, some creators are paid differently than others, and you need visibility into every step of the process to ensure you did it right. Usually, Excel is the king of high-touch financial processes because you can see all the data as you manipulate it, but Excel isn't great for scalable processes - doing steps the same way each time. Python pandas is great for that - data scientists regularly get input data, clean it up and transform it in pandas, and can observe the process as they do transformation steps in Jupyter notebooks, and produce an output. A custom process with repeatability.
- The wrong people. Previously, we were asking data guys to do the job of accounting - keeping track of payments and account balances over time. That was a mistake, and accounting is much better at this process than me. Saying no to some parts of the process that were on my plate was an important part of ensuring that when we did the process again in the future, it would go smoothly. This helped the data team focus on what we do best - cleaning, processing and manipulating the data.
The solution structure
“Excellent breakdown.”“Loving the extra details on this quarter's report :) thanks!”“This looks great! Wow. Thanks so much. :) Look at me finally making money! Yay.”“Thanks for the info, and the clarity of the new statements. Waaaaaayyy better. It’s good to be back in the black :)”“Wow that's a much nicer quarterly statement!”"Great new reports!! Super easy to see everything."“I gotta tell you. I like this breakdown much better. You’re doing such an amazing job. Thank you.”“Thank you for all your hard work on our behalf.”“The new reporting system is excellent. Much appreciated!”
Using excel as a database
Why did it succeed?
I had a lot of fun spending late nights and weekends grinding through this project to get it to the finish line. It was rewarding to take ownership of something hard, understand it deeply and move mountains to make sure it happened on time. What made it fun, and why was I able to do something that had perplexed two data scientists before me at the same time?
First, I was able to simplify the problem. Often times complexity is a symptom that some basics need to be reassessed in order to simplify the problem.
Second, I was able to focus. I took 2 1/2 weeks away from my team with permission from my manager and support from my team members, and was able to only work on this project. That meant I never lost context on the project, I never made a big mistake, and I kept in my head the vision for what needed to be created, what needed to be done, and by when.
Third, I was able to apply experiences from different domains from my experience as a financial analyst doing excel-based processes in Python, and my experience as an analytics engineer solving fundamental problems with better data models.
Being able to apply my strengths to this project, and dedicate all my work time and attention to doing one project really well was a lot of fun. It was particularly fun because the benefits of the successful execution of the project were so apparent - it was a massive relief to everyone involved when it worked. I hope to do more projects like this in the future.
Comments
Post a Comment