FinGPT - Democratize financial data and FinLLMs


Large Language Models (LLMs) have great potential for many industries, including finance. ChatGPT has impressively demonstrated the power of such LLMs for the general public.


In early 2023, Bloomberg introduced BloombergGPT, a proprietary LLM for finance. Its main advantage is Bloomberg's access to high-quality data. Open-source financial LLMs (FinLLMs) aim to democratize the use of such models, but their development faces some hurdles: data is difficult to source, comes in different formats and is inconsistent in quality.

FinGPT is a FinLLM for the finance sector. It takes a data-centric approach and provides transparent resources for researchers and practitioners to develop their FinLLMs. FinGPT is developed by the open-source community AI4Finance Foundation.

The team behind FinGPT has the following vision:

“Our vision for FinGPT is to serve as a catalyst for stimulating innovation within the finance domain.” (Quote from [1])

This article contains the most important information on FinGPT. The paper [1] serves as a reference.

AI4Finance

AI4Finance is a non-profit, open-source organization that applies artificial intelligence (AI) to finance. With FinTech tools such as FinRL and FinRL-Meta, the Foundation has already contributed to the use of AI in the financial sector.

Financial data for FinLLMs

The performance of FinLLMs depends not only on the model architecture but also on the training data. FinGPT’s data-centric approach involves the collection, preparation and processing of high-quality data.

Type of data

There is a wide range of financial data from different sources. FinGPT addresses the specifics of different data sources. Examples are:

Financial news: Financial news contains information on the global economy and companies. News is timely and dynamic, and it changes quickly. In addition, financial news has a major influence on traders’ decisions.

Company filings: Company filings and announcements are official documents that describe the financial situation and long-term strategy of a company in detail. Filings must be submitted regularly and are reviewed by the supervisory authorities, so they give a recurring view of a company's financial situation. Company announcements have a significant impact on the markets as they influence investor sentiment.

Social media: Social media reflects public sentiment towards the financial markets. Changes in public opinion can be captured in real time through social media. However, social media is a very complex source of information.

Trends: Trends can be monitored via websites such as Seeking Alpha, Google Trends and financial blogs. These websites and blogs also contain market forecasts and investment recommendations from analysts and experts. Furthermore, such websites also provide insights into market sentiment.

Challenges

The three main challenges in processing financial data are:

  • High temporal sensitivity
  • High dynamism: A constant stream of news continuously changes the financial landscape.
  • Low signal-to-noise ratio (SNR): The useful information is usually overlaid by a lot of irrelevant information.

These challenges must be taken into account when developing an effective FinLLM. The authors of the paper [1] propose FinGPT as an open-source framework.
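
Temporal sensitivity and a low signal-to-noise ratio can be pictured as a preprocessing filter that drops stale and repeated items. The following Python sketch is purely illustrative and is not FinGPT code: the `NewsItem` type, the `filter_news` function, the 24-hour window and the headline-based deduplication rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class NewsItem:  # hypothetical record type, not part of FinGPT
    headline: str
    published: datetime


def filter_news(items, now, max_age=timedelta(hours=24)):
    """Keep recent items and drop repeated headlines.

    Headlines are compared case-insensitively with whitespace collapsed.
    The 24-hour default window is an arbitrary assumption.
    """
    seen = set()
    kept = []
    # Newest first, so the most recent copy of a duplicate is the one kept.
    for item in sorted(items, key=lambda i: i.published, reverse=True):
        key = " ".join(item.headline.lower().split())
        if now - item.published > max_age or key in seen:
            continue  # stale (temporal sensitivity) or duplicate (noise)
        seen.add(key)
        kept.append(item)
    return kept


now = datetime(2023, 6, 9, 12, 0)
items = [
    NewsItem("Fed holds rates steady", now - timedelta(hours=1)),
    NewsItem("fed holds rates  steady", now - timedelta(hours=2)),   # duplicate
    NewsItem("Old earnings recap", now - timedelta(days=3)),         # stale
    NewsItem("Chipmaker raises guidance", now - timedelta(hours=5)),
]
fresh = filter_news(items, now)
for item in fresh:
    print(item.published.isoformat(), item.headline)
```

A real pipeline would need far more than this (source-specific parsing, relevance scoring, entity matching), but the sketch shows why recency and redundancy have to be handled before any text reaches the model.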




What is FinGPT?

FinGPT was developed for the financial industry. You can find it on GitHub. The following figure shows the basic structure of the FinGPT framework.

Compact Overview: FinGPT Framework (based on [1])

The framework consists of the following four components:

  • Data Source Layer: This layer aims to capture as much market information as possible. In addition to the data sources presented above, it also supports academic datasets for complex financial analysis. FinGPT uses data acquisition tools to capture structured and unstructured data through different interfaces. Data APIs are used for initial data capture as well as real-time data updates, so the model can be trained on up-to-date data. FinNLP, another AI4Finance Foundation project, is also used in this layer.
  • Data Engineering Layer: This layer is responsible for processing the real-time NLP data. State-of-the-art NLP techniques are used. Data cleaning, feature extraction, sentiment analysis, prompt engineering, monitoring and retraining also take place in this layer. This layer aims to overcome the challenges of high temporal sensitivity and low signal-to-noise ratio.
  • LLMs Layer: This layer is the heart of the FinGPT framework. The focus here is on methods for fine-tuning (lightweight adaptation) so that the model can be kept up to date. This layer includes APIs to established LLMs, which provide the basic language capability. In addition, FinGPT provides trainable models that users can fine-tune with private data for their applications. Reinforcement learning (RL) and Low-Rank Adaptation (LoRA) are used for fine-tuning. In the RL setting, the stock market serves as the environment, and feedback comes in the form of stock price changes. Building on established LLMs and fine-tuning them saves costs and avoids lengthy retraining from scratch.
  • Applications Layer: FinGPT can help professionals and individuals make informed financial decisions. Possible areas of application include robo-advisors, quantitative trading, financial sentiment analysis, financial education and many more.
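
The cost advantage of LoRA mentioned in the LLMs Layer comes from training only a small low-rank update instead of the full weight matrix. The following pure-Python sketch illustrates the general LoRA idea, W' = W + (alpha / r) · B · A, with W frozen and only A and B trainable. It is not FinGPT's implementation: the function names, the toy dimensions, the value of alpha and the 4096 × 4096 layer size are assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)                      # A has shape r x d_in
    delta = matmul(b, a)            # B (d_out x r) @ A (r x d_in)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Parameters updated by full fine-tuning vs. by a rank-r adapter."""
    return d_out * d_in, r * (d_out + d_in)


random.seed(0)
d_out, d_in, r = 4, 6, 2            # toy sizes (assumption)
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
b = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero init

# With B initialised to zero, the adapted weights equal the pretrained ones.
w_adapted = lora_weight(w, a, b, alpha=4)
print(w_adapted == w)

# Hypothetical 4096 x 4096 layer with rank 8.
full, lora = trainable_params(4096, 4096, 8)
print(full, lora, full // lora)
```

For the hypothetical 4096 × 4096 layer, full fine-tuning updates 16,777,216 parameters, while the rank-8 adapter updates 65,536, a factor of 256 fewer for that layer.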

The developers would like to release the trained model soon.

FinGPT vs. BloombergGPT

BloombergGPT shows impressive finance-specific capability but has a high computational cost. Training requires about 1.3 million GPU hours. At an AWS rate of $2.30 per GPU hour, that comes to about $3 million. FinGPT is a more cost-effective solution because it fine-tunes existing open-source models instead of training from scratch. The developers estimate the cost to be less than $300 per training run.
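
The comparison above is simple arithmetic, reproduced here as a short Python sketch. The variable names are illustrative, and the figures are the ones quoted in this section. Since $300 is an upper bound on the fine-tuning cost, the resulting ratio is a lower bound.

```python
# Figures quoted in this section
GPU_HOURS = 1_300_000          # BloombergGPT training, in GPU hours
USD_PER_GPU_HOUR = 2.3         # assumed AWS rate
FINGPT_COST_UPPER_USD = 300    # upper bound per fine-tuning run

bloomberg_cost = GPU_HOURS * USD_PER_GPU_HOUR
ratio = bloomberg_cost / FINGPT_COST_UPPER_USD

print(f"BloombergGPT training cost: about ${bloomberg_cost / 1e6:.2f} million")
print(f"At least {ratio:,.0f} times the cost of one FinGPT fine-tuning run")
```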

Outlook

FinLLMs offer a vision of a future in which everyone has access to powerful financial assistants. In the future, FinLLMs could be customized to users’ risk profiles and financial goals. The open-source idea plays a central role here: developers, researchers and experts alike should have access to advanced financial modeling techniques. It is also important that high-quality financial data is available to train these models effectively.

Conclusion

In the financial industry, developers face several challenges, such as high temporal sensitivity, a dynamic financial landscape and a low signal-to-noise ratio. FinGPT responds to these challenges. It is a cost-effective and flexible framework that uses open-source LLMs and adapts them to the requirements of the financial world.

Thanks so much for reading. Have a great day!

References

[1] Yang, H., Liu, X.Y. and Wang, C.D., 2023. FinGPT: Open-Source Financial Large Language Models. arXiv preprint arXiv:2306.06031.




Disclaimer: All texts, notes and information provided by tinztwins do not constitute investment advice or a recommendation to buy or sell securities. They are for personal information only and reflect the opinion of the authors. No recommendation for a specific investment strategy is given. The authors are NOT LIABLE for any damages caused by the software. The use of the software is at your own risk.
