In an era of rapid technological advancements, startups are the lifeblood of innovation. However, identifying promising startups for investment has historically been a challenging task, often relying on incomplete data and subjective assessments.
Enter Koble, a trailblazing company that is reshaping the landscape of early-stage startup investing by harnessing the power of AI and Deep Learning with Predibase, a machine learning platform for training and serving open source models. In this case study, we'll delve into how building their platform on top of Predibase accelerated Koble’s time to market and streamlined their development cycle.
“We adopted Predibase to save our team months of effort developing infrastructure for training and serving complex open source LLMs. With Predibase we can experiment and iterate faster with less custom work and have the option to deploy models in our own cloud. Now we don’t need to worry about scaling our infrastructure as we grow because Predibase supports efficient fine-tuning and serving of even the largest models like LLaMA-2-70B in production on A100 GPUs.” - Damian Cristian, co-founder, Koble
The Challenges of Building an AI-Powered Investment Platform
It’s no secret that early-stage investing is a high-stakes business. With just 2.5% of early-stage investments achieving the coveted 20x+ return and fewer than 5% of VC funds hitting the 3x ROI target, the challenges are glaring. When choosing seed and pre-seed companies to back, the odds of investing in a successful startup are rarely better than a coin toss.
Koble set out to develop a powerful AI solution that could use publicly available data to automate the evaluation of early-stage startup investments and give those startups access to capital in minutes, not months.
Unsurprisingly, such an ambitious goal came with proportionately daunting challenges:
- Data Overload: The internet is awash with information about startups, making it difficult for Koble to easily extract relevant insights amidst the noise. Not only is there a large volume of data, but there are many different types of data that need to be evaluated simultaneously, ranging from LinkedIn profiles and resumes to articles, open source projects, blogs, and more.
- Data Quality: Ensuring the accuracy and reliability of data sources is a significant concern. To be successful, the models need to handle inputs of widely varying quality across multiple modalities and give more weight to the most reliable ones.
- Scalability: In order to reach production scale quickly, Koble needed to develop a model architecture via an experimentation pipeline that lets them iterate quickly and optimize feature selection. Without a flexible pipeline, the team would have to manually test countless feature configurations to find the combination that most reliably predicts success. Not only did adopting Predibase dramatically accelerate experimentation, but it also saved Koble the months it would have taken to build that manual experimentation pipeline.
- Measurability: Not only did the team need a flexible and powerful architecture, but they needed to reliably keep track of their model experiments and confidently track/compare essential performance metrics. All too often, this kind of performance tracking lives in cluttered notebooks or requires implementing yet another managed service.
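To make the scale of the manual approach concrete, here is a minimal sketch of an exhaustive feature-subset sweep. The feature names, the `SIGNAL` values, and the `toy_score` function are hypothetical stand-ins (a real sweep would train and validate a model for each subset); the sketch only illustrates that n candidate features yield 2^n - 1 configurations to test and track.

```python
from itertools import combinations

# Hypothetical feature groups; illustrative names, not Koble's actual schema.
FEATURES = ["founder_profile", "press_articles", "open_source_activity", "blog_posts"]

# Made-up per-feature signal used only by the toy scorer below.
SIGNAL = {
    "founder_profile": 0.30,
    "press_articles": 0.10,
    "open_source_activity": 0.20,
    "blog_posts": 0.05,
}


def toy_score(subset):
    """Stand-in for 'train a model on this subset and return a validation metric'.

    Sums the made-up signal and subtracts a small cost per feature used.
    """
    return round(sum(SIGNAL[f] for f in subset) - 0.08 * len(subset), 4)


def sweep(features):
    """Score every non-empty subset and return (subset, score) rows, best first."""
    rows = []
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            rows.append((subset, toy_score(subset)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


results = sweep(FEATURES)
best_subset, best_score = results[0]
print(f"{len(results)} configurations tried; best: {best_subset} ({best_score})")
```

Four features already mean 15 runs; twenty would mean over a million, which is why an automated pipeline and reliable run tracking matter.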
The Solution: Rapidly Iterating on Custom AI Models with Multimodal Data on Predibase
To overcome these challenges, Koble turned to Predibase, the open source AI infrastructure platform for training and serving task-specific models, to power their investment insights platform. Known for its scalability and flexibility, Predibase provided Koble with an end-to-end platform for rapid model experimentation and deployment.
End-to-end model customization and serving in Koble's VPC with Predibase
- Experiment Iteration: Building a strong machine learning model requires a lot of trial and error. In particular, evaluating which features matter most for a given task is often a manual, time-consuming process. Predibase’s model iteration features streamlined this by automatically calculating and surfacing how much each feature contributes to a model’s results. Having this data cuts experimentation costs and iteration times, and let Koble reach strong results 10-20x faster than the manual approach.
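import pandas as pd
from predibase import PredibaseClient  # assumed import; not shown in the original snippet

# Assumed client setup; the original snippet uses `pc` without defining it
pc = PredibaseClient()
# Get input data for predictions
input_df = pd.read_csv('input_data.csv')
# Get best model version
model = pc.get_model('winners-predictor', version=38)
# Generate predictions on input data
predictions_df = model.predict('assigned_label', input_df, explanation=True, confidence=True)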
Evaluating models in Predibase is as easy as writing 3 lines of code
- Fine-tuning Language Models: A large part of the training data used to develop Koble’s solution consisted of unstructured text. Predibase’s flexible architecture allowed the Koble team to integrate any HuggingFace language model into the pipeline without reworking the model architecture or writing more than a couple of lines of code. The team could also fine-tune these language models to produce a highly accurate task-specific model.
- Explainability: Predibase's robust model explainability enabled Koble to provide not only success outlooks for the ventures in question, but also granular explanations of the elements driving their model’s predictions.
Analyzing feature importance in Predibase with built-in dashboards
- Privacy: With Predibase’s unique infrastructure design, Koble was able to train and deploy all models in their own cloud environment, mitigating any privacy and security concerns while also retaining full control over all their compute resources.
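The idea of measuring how much each feature contributes to a model’s results can be illustrated with permutation importance, a generic, model-agnostic technique. This is only a sketch under stated assumptions: the source does not say how Predibase computes its explanations, and the toy model and data below are hypothetical.

```python
import random


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return the accuracy drop caused by shuffling each feature column in turn.

    A larger drop means the model relies more heavily on that feature.
    """
    rng = random.Random(seed)

    def accuracy(data):
        hits = sum(predict(row) == label for row, label in zip(data, labels))
        return hits / len(labels)

    baseline = accuracy(rows)
    importances = []
    for j in range(n_features):
        # Shuffle column j only, breaking its link to the labels.
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + (value,) + row[j + 1:] for row, value in zip(rows, column)]
        importances.append(baseline - accuracy(shuffled))
    return importances


# Hypothetical toy setup: two numeric features, and a "model" that only uses the first.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(row[0] > 0.5) for row in rows]


def toy_model(row):
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Because the toy model ignores the second feature, shuffling it changes nothing, while shuffling the first feature costs a large share of the accuracy.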
The Outcome: Reducing the Time it Takes to Deliver AI Insights by Months
Koble was able to create a groundbreaking platform that will revolutionize startup investing. By leveraging Predibase, Koble was able to accelerate their product’s development timeline while increasing their resource and cost efficiency:
- Accelerated time to market: Through adopting Predibase’s platform for open source AI infrastructure, Koble was able to save 4 months of development time.
- Streamlined model experimentation: By using one central repository for all training jobs and performance metrics, the Koble team was able to reliably measure progress across model iterations. By automating feature selection, they iterated on their predictive model 10-20x faster and, in just four months of experimenting, trained and evaluated over 100 model versions.
- Optimized compute resources: Predibase automatically scales compute resources based on the complexity of training tasks and size of batch inference jobs. This ensures that high-performance GPUs like NVIDIA A100s are available when they’re needed, but also automatically provides low-cost commodity GPUs when they’re not, leading to a significant reduction in AWS compute spend.
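The value of a single central repository for training jobs and metrics can be sketched with a tiny in-memory registry. Everything here is hypothetical (the `Registry` and `Run` classes, the feature names, and the metric values are made up for illustration and are not Predibase’s API); it only shows how numbered model versions and their metrics can be compared in one place.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    version: int
    features: tuple
    metrics: dict


@dataclass
class Registry:
    runs: list = field(default_factory=list)

    def log(self, features, **metrics):
        """Record a training run and assign it the next version number."""
        run = Run(version=len(self.runs) + 1, features=tuple(features), metrics=metrics)
        self.runs.append(run)
        return run

    def best(self, metric, higher_is_better=True):
        """Return the run with the best value for the given metric."""
        candidates = [run for run in self.runs if metric in run.metrics]
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda run: run.metrics[metric])


# Made-up runs and metric values, for illustration only.
registry = Registry()
registry.log(["founder_profile"], roc_auc=0.71, loss=0.62)
registry.log(["founder_profile", "press_articles"], roc_auc=0.78, loss=0.55)
registry.log(["press_articles"], roc_auc=0.64, loss=0.70)

best_run = registry.best("roc_auc")
print(f"best version: {best_run.version}, features: {best_run.features}")
```

Selecting a model by version, as in the prediction snippet earlier in this case study, depends on exactly this kind of bookkeeping.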
Building an AI-powered investment insights application at Koble using Predibase
Conclusion
Koble’s journey from a startup with a vision to a game-changing platform in the world of early-stage investing showcases the transformative power of Predibase. By leveraging Predibase's data science, fine-tuning, and MLOps capabilities, Koble was able to harness the full potential of AI and disrupt a traditional industry. This case study shows how innovative companies can drive change and achieve remarkable success by building their platforms on the solid foundation provided by Predibase.
Interested in fine-tuning and serving Llama-2 or other popular open-source LLMs for free? Sign up for our free trial to get started on your use case.