AlphaMiner: Automated Telegram Deal Flow Management for Crypto VCs

Introducing AlphaMiner, an automation tool for crypto VCs that helps manage deal flow shared on Telegram. This project is inspired by AutoVC by Brent Matterson at Bankless Ventures.

GitHub - seb3point0/AlphaMiner: Telegram bot for capturing deal flow blurbs

Note that this is an alpha project and is not guaranteed to work.

Problem: Analysing Deal Blurbs from Telegram is Time-Consuming

It's no secret that Telegram is the most commonly used messaging platform for crypto companies. Teams use it to communicate internally and with each other. Venture capital firms and angel investors operating in the industry use Telegram to stay in touch with portfolio companies and share deal flow with other investors.

Crypto investors are accustomed to receiving blurbs about teams raising capital. These blurbs may vary in information depth, quality, and volume. They typically include a description, founder bios, and links to the company website, socials, and pitch deck. Parsing this information and injecting it into a CRM can be time-consuming.

AlphaMiner: An AI-Assisted Solution for Parsing Blurbs

AlphaMiner is a Telegram bot that parses deal flow blurbs and sends the relevant info to a Notion database. Just forward a blurb to the AlphaMiner bot, and it will automagically pop up in Notion moments later.

It uses OpenAI's GPT-4 model to extract information from a message and format it as structured JSON. The resulting clean data set can then be sent to Notion for processing by the investment team or an analyst.
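As a rough illustration of that last step, the sketch below maps one extracted deal onto the request body that Notion's create-page endpoint expects. It is not taken from the AlphaMiner repository: the function names and the Notion property names (`Name`, `Stage`, `Raising`, `Summary`, `Website`) are assumptions and would need to match the columns of your own database. The deal keys follow the JSON schema in the prompt shown later in this post.

```python
from typing import Any


def rich_text(content: str) -> list[dict[str, Any]]:
    """Wrap a plain string in Notion's rich-text array shape."""
    return [{"type": "text", "text": {"content": content}}]


def build_notion_page(deal: dict[str, Any], database_id: str) -> dict[str, Any]:
    """Map one extracted deal onto a Notion 'create page' request body.

    The property names used here are hypothetical; rename them to match
    the columns of the target database.
    """
    properties: dict[str, Any] = {"Name": {"title": rich_text(deal["name"])}}

    # Optional fields are only sent when the model actually returned them.
    if deal.get("stage"):
        properties["Stage"] = {"select": {"name": deal["stage"]}}
    if deal.get("raising"):
        properties["Raising"] = {"rich_text": rich_text(deal["raising"])}
    if deal.get("summary"):
        properties["Summary"] = {"rich_text": rich_text(deal["summary"])}

    # The prompt allows a link to be a string or a {"link", "password"} object.
    website = deal.get("links", {}).get("website")
    if isinstance(website, dict):
        website = website.get("link")
    if website:
        properties["Website"] = {"url": website}

    return {"parent": {"database_id": database_id}, "properties": properties}
```

The returned dictionary would be sent as the JSON body of the create-page request, with the database ID and integration token supplied by your own configuration.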

Building AlphaMiner

The challenge with blurbs is that they come in all shapes and sizes. Some contain detailed information about the company's product, team, traction, fundraising terms, and links to all relevant documents and socials. Others are basic and have little depth. Furthermore, none of the data is structured: each blurb is unique.

The key to extracting the relevant data from the blurb is the prompt. It has to consider edge cases and always be strict enough to return valid JSON.

Extract the following information from the message below (enclosed in `[ ]`) and output it as a JSON object with these keys, when values are defined:

```json
[ 
    { 
        "name": "", /* Company or project name */ 
        "stage": "", /* Stage of fundraise [e.g., Pre-seed, Seed, Series A] */ 
        "type": "", /* Fundraise structure [e.g., Equity, SAFE, SAFT, Token Sale, Private Sale] */ 
        "raising": "", /* Amount raising [e.g., $1.5m] */ 
        "fdv": "", /* FDV or token valuation [e.g., $40m] */ 
        "valuation": "", /* Equity valuation if different from FDV [e.g., "$20m" for a 1:2 token ratio] */ 
        "token_ratio": "", /* Token ratio or token warrant ratio [e.g., "1:1", "1:2"] */ 
        "committed": "", /* Amount committed [e.g., $500k] */ 
        "vesting": "", /* Vesting terms for tokens [e.g., "12 month cliff, 24 month linear vesting", "5% unlocked at launch, 95% 3 months cliff and 12 months linear unlock"] */ 
        "investors": [], /* Current or previous investors, including angels, in an array */ 
        "summary": "", /* Brief summary of the company based on the information provided. Do not include information already extracted [e.g., links, valuation, etc] */ 
        "links": { 
            /* Standard links categorized by type; For links with passwords [e.g., "pw", "password", "passcode", "code"], represent them as objects with "link" and "password" [e.g., { "link": "", "password": "" }], otherwise, use the link as a string */ 
            "website": "", 
            "deck": "", /* Recognize different terms "deck", "pitch deck", "pitch" as "deck"; Most common domains are "docsend.com", "pitch.com" */ 
            "whitepaper": "", /* Common as ".pdf" file or domain "docsend.com" */ 
            "blog": "", /* Common domains are "blog.", "medium.com", "mirror.xyz" */ 
            "demo": "", 
            "documentation": "", /* Common domains are "docs." */ 
            "data_room": "", /* Common domains "docsend.com", "notion.site" */ 
            "roadmap": "", 
            "tokenomics": "", 
            "calendar": "", /* Recognize different calendar booking services "calendly.com", "cal.com", "calendar.app.google.com" as "calendar" */ 
            "other": [ 
                /* For links that do not fit into standard categories, include them in an array "other" If a title or description is included (e.g., in markdown links or preceding the link), include them as objects with "link" and "title"; if no title is provided, include the link as a string.  */ 
                "", /* Links with no recognizable title */ 
                { 
                    "link": "", 
                    "title": "" 
                } /* Links with title */ 
            ] 
        }, 
        "socials": { 
            "x": "", /* When twitter.com, convert to x.com */ 
            "linkedin": "", 
            "github.com": "", 
            "discord": "", /* Domain "discord.gg" */ 
            "telegram": "", /* Domain "t.me" */ 
            "youtube": "" /* Domains "youtube.com" & "youtu.be"*/ 
        }, 
        "team": [ 
            /* Array of team members with "name", "role", "linkedin", "x" */ 
            { 
                "name": "", 
                "role": "", 
                "linkedin": "", 
                "x": "" 
            }, 
            { <team_member_n> } // Add other team members
        ] 
    }, 
    { <company_n> } // Add additional companies
] 
```
**Notes:**
If only one valuation is mentioned, set it as the FDV and leave “valuation” empty.
If a token ratio is provided (e.g., “1:2”), calculate the equity valuation as FDV divided by the ratio (e.g., FDV $40m with 1:2 ratio implies equity valuation of $20m). Set “valuation” to the equity valuation and “fdv” to the token valuation.
Phrases like “raising at” or “@” indicate the valuation (e.g., “Raising 3M@50M token” means raising $3M at a $50M FDV).
Include any vesting terms mentioned under “vesting” (e.g., “12 month cliff, 24 month linear vesting” or “5% unlocked at launch, 95% 3 months cliff and 12 months linear unlock”).
Extract links in any format (plain text or markdown). Remove any markdown formatting from the extracted data.
Never include any explanations or additional text.

**JSON Formatting Details:**
Only include keys in the JSON object if their values are defined and non-empty. Do not include any keys with empty strings, empty arrays, or empty objects.
Ensure that the output is valid JSON under any circumstances.
Never wrap the output in Markdown code-fence formatting (for example, a “json” fence).
The output MUST start with “[” and end with “]”.
Keys should always match those provided above.
Do not output comments—these are only for instructions.
Include validity checks to make sure characters are properly escaped.
If there are multiple companies, return an array (e.g., [ { <company 1> }, { <company 2> }, { <company n> } ]).
Always return an array, even if there is only one company (e.g., [ { <company 1> } ]).
Output the JSON as a single line string with no line breaks.
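Even with strict formatting rules, a model can occasionally ignore them, so it may be worth validating the reply before it goes anywhere near Notion. The following is a minimal sketch of such a check, not code from the AlphaMiner repository; the helper names (`strip_code_fence`, `prune_empty`, `parse_deals`) are hypothetical. It removes a stray code fence, parses the JSON, wraps a single object in an array, and drops empty values, mirroring the rules above.

```python
import json
from typing import Any

FENCE = "`" * 3  # Markdown code-fence marker


def strip_code_fence(raw: str) -> str:
    """Remove a surrounding Markdown code fence if the model added one anyway."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines).strip()
    return text


def prune_empty(value: Any) -> Any:
    """Recursively drop empty strings, lists, dicts, and nulls."""
    empty = ("", [], {}, None)
    if isinstance(value, dict):
        cleaned = {key: prune_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if item not in empty}
    if isinstance(value, list):
        cleaned_items = [prune_empty(item) for item in value]
        return [item for item in cleaned_items if item not in empty]
    return value


def parse_deals(raw: str) -> list[dict[str, Any]]:
    """Parse the model reply into a list of company dictionaries.

    Raises ValueError (json.JSONDecodeError is a subclass) when the reply
    is not valid JSON or does not have the expected shape.
    """
    data = json.loads(strip_code_fence(raw))
    if isinstance(data, dict):
        data = [data]  # tolerate a single object instead of an array
    if not isinstance(data, list) or not all(isinstance(d, dict) for d in data):
        raise ValueError("expected a JSON array of company objects")
    return [prune_empty(deal) for deal in data]
```

A caller could catch `ValueError` and either retry the request or flag the original blurb for manual review.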

The prompt is a work in progress and was iteratively refined using ChatGPT o1-preview. More testing and refinement are needed to improve the accuracy of the extraction.
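One way to test the valuation rules from the prompt's notes is to compare the model's output against a small deterministic reference. The sketch below is an assumption about how that could look, with hypothetical helper names. It handles only compact shorthand such as "$40m" or "3M@50M", and it reads a ratio of "1:2" as equity to token, which matches the single worked example in the prompt; other ratio conventions would need their own handling.

```python
import re
from fractions import Fraction

# Suffix multipliers for shorthand amounts (assumed: k, m, b only).
_MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def parse_amount(text: str) -> int:
    """Convert shorthand such as "$40m", "3M", or "$500k" into whole dollars."""
    match = re.fullmatch(r"\$?\s*(\d+(?:\.\d+)?)\s*([kmb]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised amount: {text!r}")
    number, suffix = match.groups()
    # Fraction avoids float rounding on inputs like "1.5".
    return int(Fraction(number) * _MULTIPLIERS.get(suffix, 1))


def equity_valuation(fdv: str, token_ratio: str) -> int:
    """Apply the prompt's rule: a 1:2 ratio puts equity at half the FDV."""
    equity_part, token_part = (int(part) for part in token_ratio.split(":"))
    return int(parse_amount(fdv) * Fraction(equity_part, token_part))


def parse_raise(text: str) -> tuple[int, int]:
    """Split shorthand like "3M@50M" into (amount raising, FDV)."""
    raising, _, fdv = text.partition("@")
    return parse_amount(raising), parse_amount(fdv)
```

For the examples used in the prompt, `equity_valuation("$40m", "1:2")` gives 20,000,000 and `parse_raise("3M@50M")` gives 3,000,000 raised at a 50,000,000 FDV, which can be checked against the "valuation", "raising", and "fdv" values the model returns.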

What's Next: Deal Analysis

This could be a valuable tool for our team, as we receive dozens of deal blurbs weekly. Being able to easily capture deal information for later processing will be a huge time saver.

The long-term goal is to build a model that generates a report and provides scoring along relevant metrics. This could be achieved by capturing the deck from DocSend and scanning its content with OCR (in progress). The extracted data could also be used alongside the website, blog, documentation, and social media content.

If you want to contribute to this project, DM me.