Some Tacit Knowledge in Biotech
Some of my non-obvious observations, thoughts, and resources.
TLDR: I think AI does a great job with the basics, but it misses some of the more tacit knowledge. This is a *somewhat* organized list of that knowledge for biotech, based on my experience.
This is for the biotech curious who don’t know where to start. Healthcare is an industry supposedly reserved for specialists: MDs and PhDs with years of training doing in-depth research. Most investors avoid it like the plague with an “anywhere but healthcare” mandate.
But the biotech bug catches a few of us. We realize times are tough and most companies fail, but the best companies have performed quite well. We realize it’s possible to predict the future while still investing in life-changing therapies. We get a little addicted to the adrenaline of clinical data releases.
But it’s hard because hidden agreements shield people from the real effects and interpretation of news. When Trump says something about MFN pricing, who does that actually affect? Is that phase 3 data actually good?
You’re interested in learning more about healthcare/biotech, but you’re not an MD/PhD and don’t know where to start. You’ve started throwing prompts at ChatGPT, tried KOLs (which have issues), and maybe you’ve attempted parsing PubMed. But you’re missing a few pieces when chatting with the specialists because you lack the tacit knowledge they learned through experience, so you defer to their interpretations. This article details some of that tacit knowledge. The target audience is new investors and anyone else looking for resources, common red flags, and miscellaneous notes.
The biotech Stack: Some under the Radar Resources.
Base rates
Clinical Data release and Interpretation
The company reported X, why is the stock down/up?
Every Change Matters
Misc Notes
Resources
The resources to use. A few months ago I put together a list of the best resources for biotech investors. Here is the sheet and list of resources: https://x.com/plainyogurt21/status/1842256106808344812
I want to highlight a few that may fly under the radar for new investors (PubMed is pretty widely known):
FDA drug review documents: The FDA documents all correspondence, including thoughts on safety, efficacy, and trial design. It’s a fascinating window into the brain of the FDA and it's all free. Start downloading documents and toss them into NotebookLM for a deep dive.
Insurance documents: Search for “Drug Name Insurance Coverage filetype:pdf” to surface coverage guidelines, for example “Attruby™ (acoramidis) - Prior Authorization/Medical Necessity - UnitedHealthcare Commercial Plans”. It’s not comprehensive, but a few Google searches give you a trend.
Medicare: CMS publishes formulary data every month/quarter so you get a sense of tiered coverage for Part D medications. The formulary by insurance plan provides some data comparing drugs on required prior auth, cost, etc.
DepMap: A free, easy-to-use tool for exploring cancer dependencies. It helps you understand how specific therapies can target driver mutations in particular cancers. The site has a helpful tutorial, and you can find examples on how to get started.
Pharmagellan: Read it. Read the references. Subscribe to Frank David. The Pharmagellan books are the best introduction to biotech investing and the tricks of the trade. They offer an in-depth exploration of how to model biotech companies and analyze clinical trials. The insights may seem obvious to experienced investors, but honestly, every biotech investor should have this on their shelf. It's no substitute for putting money to work, but it's damn good.
FDA FOIA requests: Request adverse event profiles (without the FAERS delay) for free, plus see who's requesting documents (spoiler: Point72 makes a lot of requests—recently for Winrevair's adverse events). You can also ask for 483 inspection reports, which Redica charges for but are cheaper directly from the site.
FDA Calendar: Click on the website, click on "Add to Calendar," and boom—you've got all the PDUFAs synced to your calendar.
Medicaid Drug pricing: Pharmacy Pricing | Medicaid
Simulating Trial Designs: Trial Design
The best place to start is often Nature Reviews Disease Primers for epidemiology, basic treatment guidelines, and pathophysiology.
cBioPortal for a cool Cancer Genetics database.
Every acquisition timeline is public. You know who made what offer.
Those are some of the resources that aren’t immediately obvious.
Some base rates
Market modeling
McKinsey White paper (10 years old) on base rates for new market entrants: Pharma’s first-to-market advantage | McKinsey
Ex-US markets are usually equal in size to the US market (putting aside major drug pricing changes)
Standard assumptions in modeling: 3.5X price to peak sales, peak sales occur 7 years out, approximately 10% WACC, 35% FCF margin. This usually values a phase 3 program at 1.5X Peak sales.
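To make the arithmetic behind these assumptions concrete, here is a minimal Python sketch. It only discounts a 3.5X price-to-peak-sales multiple back 7 years at a 10% WACC. The function name and the `prob_success` parameter are my own illustrative assumptions, and the sketch ignores the 35% FCF margin, the cash flows on the way to peak, and remaining trial costs, so it should not be expected to reproduce the 1.5X rule of thumb exactly.

```python
def discounted_peak_sales_multiple(price_to_peak_sales: float,
                                   years_to_peak: int,
                                   wacc: float,
                                   prob_success: float = 1.0) -> float:
    """Value today, expressed as a multiple of peak sales.

    Takes the value at peak (price_to_peak_sales x peak sales), discounts it
    back years_to_peak years at the WACC, and scales by an assumed
    probability of success.
    """
    discount_factor = (1.0 + wacc) ** years_to_peak
    return prob_success * price_to_peak_sales / discount_factor


if __name__ == "__main__":
    unrisked = discounted_peak_sales_multiple(3.5, 7, 0.10)
    print(f"Unrisked value: {unrisked:.2f}x peak sales")

    # prob_success values below are hypothetical inputs, not base rates
    # taken from this article.
    for pos in (0.50, 0.65, 0.80):
        risked = discounted_peak_sales_multiple(3.5, 7, 0.10, prob_success=pos)
        print(f"PoS {pos:.0%}: {risked:.2f}x peak sales")
```

The unrisked figure lands somewhat above 1.5X. How far a real model lands from the rule of thumb depends on the probability of success and on the costs this sketch leaves out.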
Clinical trial success rates
Clinical trial success rates are lower in oncology and higher in rare disease. More detailed data, and some data on higher success rates for orphan drugs.
Costs to run a trial: It costs about $50k per patient to run a trial. For most small molecule trials, that is largely driven by recruiting costs. For CAR-T trials, manufacturing the drug can also cost a lot.
Timelines and types of FDA meetings: Typically companies have a 30 day window after any meeting where they know how it went but have to wait for meeting minutes. Sometimes they let things slip. See the Capricor example later in the article.
Clinical Data release and interpretation
Please, for god's sake, look at the ACTUAL data, not just the interpretation. Dig into HOW the endpoints were measured. What was the timepoint?
Here’s Replimune playing with the overall response rate (ORR) and complete response (CR) data. In 2023, they were running a phase 3 trial in cutaneous squamous cell carcinoma (CSCC). In theory, they had proof-of-concept phase 2 data in hand, but the actual data didn’t show what they reported. They reported response rate as the “best overall response”, but the phase 3 trial measured response rates at 1 year. This means any patient who responded in the first 6 months and then relapsed counted as a responder in the phase 2 but would count as a failure in the phase 3.
This picture describes a total of 14 patients. 7 have a complete response, 4 have PD, 2 have SD, and 1 has a PR. One of the CRs relapses and is re-initiated on treatment, but they still report a 50% CR rate. In phase 3 that’s not a CR.
So their reported 46% CR rate (7/15) and 9/14 ORR is in reality an ORR of ~8/15, because only one PR is confirmed. With 15 patients, every patient swings the response rate by about 7 percentage points. Then at the next data cut, they start removing patients from the chart...
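To see how fragile a response rate is at this sample size, here is a short Python sketch. It computes the per-patient swing and a 95% Wilson score interval for 7 responders out of 15. The Wilson interval is my choice of a standard small-sample method, not something the company reported, and the function names are illustrative.

```python
import math


def per_patient_swing(n_patients: int) -> float:
    """Change in response rate when one patient flips status."""
    return 1.0 / n_patients


def wilson_interval(responders: int, n_patients: int, z: float = 1.96):
    """Wilson score confidence interval for a proportion (z=1.96 ~ 95%)."""
    p = responders / n_patients
    denom = 1.0 + z**2 / n_patients
    center = (p + z**2 / (2 * n_patients)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / n_patients + z**2 / (4 * n_patients**2))
        / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    n = 15
    print(f"One patient moves the rate by {per_patient_swing(n):.1%}")
    low, high = wilson_interval(7, n)
    print(f"7/{n} responders: 95% interval {low:.0%} to {high:.0%}")
```

With so few patients the interval spans tens of percentage points, which is why reclassifying one or two patients changes the story.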
They play the same games here. In the next presentation, from August 2022, they’ve removed the two progressive disease (PD) patients and counted patients who relapsed as PR/CR, which still inflates the true ORR. The true response rate in a phase 3 would be 10-20% lower (the phase 3 trial failed).
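The gap between “best overall response” and a response measured at a fixed timepoint is easy to show in code. The sketch below uses four made-up patients (not Replimune data) and compares the two definitions. Treating a patient with no scan at the landmark as a non-responder is my assumption about how a strict analysis would handle it.

```python
RESPONSE_RANK = {"PD": 0, "SD": 1, "PR": 2, "CR": 3}
RESPONDER = {"PR", "CR"}

# Hypothetical scan histories as (month, assessed response). Not real trial data.
patients = {
    "pt1": [(3, "PR"), (6, "CR"), (12, "CR")],
    "pt2": [(3, "CR"), (6, "CR"), (12, "PD")],  # responds early, relapses by 1 year
    "pt3": [(3, "SD"), (6, "PR"), (12, "PR")],
    "pt4": [(3, "PD")],                         # progressed, no later scans
}


def best_overall_response(scans):
    """Best response seen at any scan, regardless of what happened later."""
    return max((resp for _, resp in scans), key=RESPONSE_RANK.__getitem__)


def response_at_landmark(scans, month):
    """Response at one fixed timepoint; a missing scan counts as PD (assumption)."""
    for scan_month, resp in scans:
        if scan_month == month:
            return resp
    return "PD"


def response_rate(labels):
    """Share of patients whose label is PR or CR."""
    labels = list(labels)
    return sum(label in RESPONDER for label in labels) / len(labels)


bor_rate = response_rate(best_overall_response(s) for s in patients.values())
landmark_rate = response_rate(response_at_landmark(s, 12) for s in patients.values())

print(f"Best overall response rate: {bor_rate:.0%}")
print(f"Response rate at 12 months: {landmark_rate:.0%}")
```

The same four patients give a higher rate under best overall response than at the 12-month landmark, purely because the early responder who relapsed is counted differently.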
Another example: Pyxis Oncology ($PYXS) reported phase 1 data for an ADC last year.
A good discussion on Twitter summarized it as follows. Pyxis reported phase 1 results for their antibody-drug conjugate (ADC) in a large cohort of patients with multiple tumor types. They saw a few responses in a small cohort of head and neck cancer patients and highlighted that as “great data”: a 50% response rate compared to 37% for the best available therapy. Putting aside how it failed in every other tumor type, the data in the head and neck cancer patients is iffy too.
They say one patient with an unconfirmed response scanned after the cutoff confirmed the PR. Maybe nothing, but they should report data for every patient scanned after the cut off.
They select 6 patients from the HNSCC cohort, but 8 patients were treated: where are the two patients who went unreported?
One of the patients “confirmed their response” (i.e. they had 2 scans in a row with a response) AFTER the final data cut. If we’re looking at the successful patients, the company should report on all scans for ALL patients.
The drug may still be active, it can still work. But reporting phase 1 data, especially in Oncology, is ripe for manipulation.
So I need to emphasize: look at the data, not the interpretation. Look deep at each endpoint. Where did the patients drop out? What is the endpoint measuring? Is management gaslighting you?
Cadence of Data release
Start trial -> recruit and enroll trial -> interim analysis (may be reported) -> topline data release (press release) -> further details at a conference (6 months after initial release) -> publication 6-12 months after conference (usually paywalled) -> incorporated into guidelines ~1 year out -> open access publication and inclusion in systematic reviews, etc.
Read the Appendix for any publication. Check any data files (and every hidden sheet in them).
Index to the trial, not the drug:
“What’s the efficacy of X drug?” is a meaningless question in my opinion. The question to ask is “What’s the efficacy of X drug in Y trial?” Each trial is run with different inclusion criteria, exclusion criteria, outcomes, timelines, etc. Thus, every trial will report different results. Further, multiple updates for a single trial are released at multiple timepoints (see above). When researching a drug, index to the trials, not the drugs.
Open Label Data
Some trials are run “open label” which means both the researchers and the participants know they’re getting the drugs at an individual level. The sponsors don’t know the details but can get a feel from the researchers if the trial is going well.
But take open label with a grain of salt. We design placebo-controlled trials because we need to know if the drug is better than doing nothing. Open label trials don’t do that. They can be manipulated, and you should expect regression to the mean, specifically on the
What do the patients look like?
I think it’s really important to contextualize any number. Response rates, overall mortality, etc. often obfuscate how patients look and feel in a trial. It’s easy to compare response rates between trials, and that’s important, but that’s not how a drug is used.
Let’s use lymphodepletion as an example. Lots of cell therapies use “lymphodepletion” to wipe out the patient’s cells before giving them the drug. In theory, “2 weeks in a hospital” is swept under the rug because patients
Dose-Response is really important.
People underestimate the importance of knowing HOW a drug works. The point of early phase trials is to validate the proof of concept for a drug. It should work exactly how you designed it so we should see the efficacy increase as we increase the dose. That means the data isn’t just an out
Biomarkers vs functional endpoints:
Biomarkers can be useful. Useful biomarkers are often a direct proxy for efficacy in a phase 3. Early stage functional outcomes (defined in the Appendix) are easy to manipulate, but they’re often required for FDA approval. Often, the best programs show early benefit on proven biomarkers with trends on outcomes before running a phase 3 to demonstrate outcome benefits.
A few other flags from clinical data
BICR vs investigator-assessed response rate: BICR (blinded independent central review) is based solely on imaging and is what the FDA requires. The two often converge over time, but investigator-assessed progression can skew data positive.
Intent to Treat vs Per protocol: Are they measuring the patients who actually got treatment?
Trial Protocol: Did they change it multiple times?
Censoring: why would people drop out? Is this imbalanced?
Is the analysis pre-specified or post hoc?
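The intent-to-treat vs per-protocol flag in the list above comes down to which patients sit in the denominator. Here is a minimal Python sketch with a made-up drug arm. The `Patient` class, its field names, and the numbers are all hypothetical, and counting dropouts as non-responders in the ITT analysis is a common convention I am assuming, not something specific to any trial mentioned here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    completed_per_protocol: bool  # stayed on treatment as the protocol required
    responded: bool               # met the response definition at the endpoint


def response_rate(patients):
    """Fraction of responders in a list of patients."""
    return sum(p.responded for p in patients) / len(patients)


def itt_rate(randomized_arm):
    """Intent to treat: every randomized patient stays in the denominator."""
    return response_rate(randomized_arm)


def per_protocol_rate(randomized_arm):
    """Per protocol: only patients who completed treatment as specified."""
    completers = [p for p in randomized_arm if p.completed_per_protocol]
    return response_rate(completers)


# Hypothetical drug arm: 10 randomized, 4 dropped out early (no response recorded).
drug_arm = (
    [Patient(True, True)] * 4
    + [Patient(True, False)] * 2
    + [Patient(False, False)] * 4
)

print(f"ITT response rate: {itt_rate(drug_arm):.0%}")
print(f"Per-protocol response rate: {per_protocol_rate(drug_arm):.0%}")
```

Same arm, same responders, but the per-protocol number looks much better because the dropouts vanish from the denominator. That is why it matters which population a press release quotes.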
The company reported X, why is the stock down/up?
I see this question a lot. Fundamentally, stocks move on expectations, so if expectations were higher than results... Here are a few common reasons YOUR expectations may differ from OTHERS', leading to a mismatch.
The drug was approved, why is the stock down? Approval was expected and thus priced in. They need to demonstrate the drug can be commercially successful before getting credit.
My company reported clinical results, why is the stock down?
The trial didn’t succeed: First off, any post hoc, secondary analysis, subgroup shenanigans are all the same: the trial failed. If the headline isn’t “Phase 3 hit stat sig”, that’s a failed trial.
The trial was statistically successful but the standard of care is better. The onus is on a new drug to prove it’s better than any available competition, and uptake relies on being better than standard of care. For example, belatacept, a drug for kidney transplant, ran its phase 3 against cyclosporine but failed commercially because the standard of care was actually tacrolimus.
Competitive programs show better data. The trial is statistically significant AND clinically meaningful, but a competitive program has some differentiation (efficacy, safety, etc.) which drastically shrinks the market. VERA is a good example (*Footnote: put aside the mITT vs PP data). The data is meaningfully better than standard of care but pales in comparison to the crowded competitive market.
The data was consensus: Everyone expected positive results. Investing is a game of expectations and risk. A highly derisked program reading out expected results doesn’t shift investor opinions.
The data doesn’t move the needle: A clinical data readout should derisk a development program. If a readout doesn’t increase the estimated future probability of success, the update isn’t very helpful.
Your company needs cash. A company should have 1-2 years of cash in the bank heading into any clinical readout. Without that buffer, they have to raise.
What about that partnership? Investors could not care less. Partnerships are a serious stock killer in the near term because they signal less chance of acquisition and investors never assign any value to them. It’s validation from larger pharma, but the partnership itself doesn’t excite public investors.
The drug is part of the fast track/priority review/breakthrough designation programs: These are merely FDA programs and do not signal deep understanding of clinical data. Fast track means more interaction with the FDA. Breakthrough designation includes all benefits of fast track with a little more interaction. A priority review is at the time of application. In general, people don’t care about these. It’s a signal in the right direction but other than speeding up the process a little bit, I wouldn’t put much stock into them.
The same principles apply in the opposite direction too: “X reported bad data, why is it up?” As a former teacher once told my class: “You failed at a slower rate than expected”. That can be a win.
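The cash point in the list above is simple arithmetic worth running before any readout. Below is a rough Python sketch. The function names and the example figures are hypothetical, and reading “1-2 years of cash heading into a readout” as at least one year of runway left on the readout date is my interpretation.

```python
def cash_runway_years(cash_musd: float, quarterly_burn_musd: float) -> float:
    """Years of cash left at the current burn rate (both inputs in $M)."""
    if quarterly_burn_musd <= 0:
        raise ValueError("quarterly burn must be positive")
    return cash_musd / (quarterly_burn_musd * 4)


def has_readout_buffer(cash_musd: float,
                       quarterly_burn_musd: float,
                       years_until_readout: float,
                       min_buffer_years: float = 1.0) -> bool:
    """True if at least min_buffer_years of runway remains at the readout date."""
    runway = cash_runway_years(cash_musd, quarterly_burn_musd)
    return runway - years_until_readout >= min_buffer_years


if __name__ == "__main__":
    # Hypothetical company: $120M cash, $15M quarterly burn, readout in 6 months.
    print(f"Runway: {cash_runway_years(120, 15):.1f} years")
    print("Buffer OK:", has_readout_buffer(120, 15, years_until_readout=0.5))
    # Same burn with half the cash: likely to need a raise around the readout.
    print("Buffer OK:", has_readout_buffer(60, 15, years_until_readout=0.5))
```

Cash and burn both come from the latest quarterly filing, so this check takes a couple of minutes per company.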
Every change can matter
Example 1: AnaptysBio stock doubles based on a few comments from Eli Lilly
Every wording change, PowerPoint presentation change, website difference, and piece of commentary matters.
I can’t make this more clear than ANAB mid 2024. The stock goes from 20 to 40 based on a few bullish comments from Eli Lilly at a conference.
Then Eli Lilly discontinues their program later that year and the stock is down 75% from the peak.
Example 2: Dyne Therapeutics changes its wording around the FDA between Q2 and Q3 2024, and the stock falls
The company was originally pursuing accelerated approval explicitly for its DM1 treatment.
“In addition to a clinical update in May 2024, the Company also noted that based on recent dialogue with the Center for Drug Evaluation and Research (CDER) division of the U.S. Food and Drug Administration (FDA), Dyne continues to pursue an accelerated approval pathway for DYNE-101 in DM1, including leveraging splicing as a potential surrogate biomarker.”
But they change their wording in Q3 2024:
“Dyne continues to pursue expedited approval pathways globally for DYNE-101 utilizing splicing as a surrogate endpoint.”
Subtle, but expedited approval pathways are not Accelerated approval. That change matters, especially considering their competition is using a (better) functional endpoint for approval. Investors are definitely reading into that wording change as meaningful.
Example 3: Capricor let the meeting minutes slip.
Capricor Therapeutics is developing a therapy for Duchenne muscular dystrophy. The drug failed a few early trials, but the company had an upcoming phase 3 trial in 2024 and was planning to file for approval based on the phase 3 data.
Originally, the company met with the FDA to discuss BLA plans in May 2024. They mention receiving details in late Q2.
Then they get the pre-BLA (drug application) meeting in late August.
Expectations were for a BLA filing based on the HOPE3 data, which was expected to fail. However, management lets it slip before the meeting minutes are official: maybe the FDA is opening the window to alternative paths. They revealed the flexibility from the FDA without an official press release, only on the earnings call.
And voila in September.
The stock is up 200%. I cannot emphasize enough how much these little updates matter. There are no coincidences.
These little updates happen at every interval: Earnings, transcripts, conferences, presentations, publications, private calls. A slight change in data can be positive or negative and every professional investor is reading into the details.



Miscellaneous notes + Memes
Liver toxicity - What is this?
I was confused about this for a long time, but it’s the bane of early stage drug development. Simply put, ALT/AST elevations represent liver cell damage. Drugs can damage the liver in a million ways, so I won’t belabor them all. A few basic principles: anything looking for an electron causes damage. Hy’s Law kills a drug program, even if it’s only 1 or 2 patients. Helpful FDA Guidance, overview article
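For readers who want to screen a safety table themselves, here is a hedged Python sketch of a Hy's Law flag. The thresholds follow the criteria as I understand them from the FDA's drug-induced liver injury guidance: ALT or AST at least 3x the upper limit of normal (ULN) together with total bilirubin above 2x ULN, without signs of cholestasis. The 2x ULN cutoff for alkaline phosphatase (ALP) is a commonly used operationalization that I am assuming here, and the class and function names are illustrative. A real Hy's Law call also requires ruling out other causes (viral hepatitis, other drugs), which code cannot do.

```python
from dataclasses import dataclass


@dataclass
class LiverPanel:
    """Lab values expressed as multiples of the upper limit of normal (ULN)."""
    alt_x_uln: float
    ast_x_uln: float
    total_bilirubin_x_uln: float
    alp_x_uln: float


def is_potential_hys_law_case(panel: LiverPanel, alp_cutoff: float = 2.0) -> bool:
    """Flag a lab pattern consistent with Hy's Law (screening only)."""
    hepatocellular_injury = max(panel.alt_x_uln, panel.ast_x_uln) >= 3.0
    impaired_function = panel.total_bilirubin_x_uln > 2.0
    # alp_cutoff is an assumed threshold for "no evidence of cholestasis".
    not_cholestatic = panel.alp_x_uln < alp_cutoff
    return hepatocellular_injury and impaired_function and not_cholestatic


if __name__ == "__main__":
    # Hypothetical patients, not from any real trial.
    transaminase_only = LiverPanel(5.0, 4.0, 1.0, 1.2)
    possible_hys_law = LiverPanel(5.0, 4.0, 2.5, 1.2)
    print("ALT/AST elevation alone:", is_potential_hys_law_case(transaminase_only))
    print("ALT/AST plus bilirubin:", is_potential_hys_law_case(possible_hys_law))
```

The point of the combination is that ALT/AST elevation alone shows cell damage, while the added bilirubin rise suggests the liver is losing function.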
Genetic evidence
I underestimated the value of genetic evidence. One of the easiest ways to validate a hypothesis is human genetic evidence. A drug may test what happens if we restore the function of a protein. There are humans out there who have that mutation. We can use their mutation and outcomes for a directionally correct answer. Genetic validation is huge.
Thinking like a doctor
Doctors follow a set process to prescribe medications and have to be FORCED off their standard of care to change. Clinical guidelines are how doctors prescribe medications, but they also care about ease of access. They want something that works and that they don’t have to spend more time educating patients on.
Medical school is 2 years of basic science, 2 years of clinical training. Once they graduate, they are “Dr. XYZ”. Doctors then spend 3-6 years in “residency” which is intense training to specialize in a field. Not much time for investing in stocks during this journey. Nowhere do they train in how to pick stocks, successful drugs, drug development, insurance, financial modeling. They are trained to treat patients and that’s it.
Who is actually using the drug?
Speaking of treating patients, some specialists are more comfortable with safety risks while others are more cautious. One key split is the difference between dermatology (think eczema/Dupixent) and rheumatology (think Humira/rheumatoid arthritis). Rheumatologists are willing to tolerate increased safety issues, while dermatologists are scared away by a black box warning (regardless of the real world profile). Pay attention to the end prescriber.
Past success != future success for Management teams.
Biotech requires both skill and luck so prior success does not guarantee future success. “Let’s bring back the gang” has some high profile failures:
https://x.com/adar170/status/1770786922513698946
Some humor: https://www.reddit.com/r/biotech/comments/1ad6ie2/big_pharma_in_a_nutshell/
How do I evaluate a competitive landscape quickly?
Honestly, those fancy tools comparing response rates by drug program, trial, etc. are good for business development, but they’re often not very helpful for investors. As an investor, I want to know only a few things: late-stage drug efficacy, current standard of care, and unmet needs. We can get there easily with the following:
Step 1: find a clinical guideline for the disease of interest, what is currently used? Find the label for any approved drugs and jot down the a) trial design, b) endpoints, c) efficacy d) dosing/administration
Step 2: Go to ChatGPT and ask Deep Research for a comprehensive overview of late stage drugs and their efficacy.
Step 3: For each drug in phase 3 trials -> evaluate efficacy and safety using any publications from conferences, presentations from the company. Jot down the same info.
This should take you about 1-2 hours and you’ll know approximately the expected topline results for each of the major drugs. This isn’t comprehensive, but we don’t need it to be comprehensive to evaluate competitive efficacy. We know roughly how good a drug should be to be competitive.
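If you prefer to keep those notes in a structured form, the steps above can be sketched as a small Python table. Everything here is hypothetical: the `TrialRecord` fields simply mirror the a) to d) items from Step 1, the drugs and numbers are placeholders, and comparing one efficacy number across trials is only a rough guide because each trial has its own design and population.

```python
from dataclasses import dataclass


@dataclass
class TrialRecord:
    drug: str
    trial: str
    stage: str             # "approved" or "phase 3"
    design: str            # a) trial design
    primary_endpoint: str  # b) endpoints
    efficacy: float        # c) headline result on the primary endpoint
    dosing: str            # d) dosing/administration


def efficacy_bar(records):
    """Best headline efficacy among approved drugs (the bar to beat)."""
    return max(r.efficacy for r in records if r.stage == "approved")


def competitive_candidates(records):
    """Phase 3 drugs whose reported efficacy meets or beats the approved bar."""
    bar = efficacy_bar(records)
    return [r for r in records if r.stage == "phase 3" and r.efficacy >= bar]


# Placeholder entries; fill these in from labels, guidelines, and presentations.
records = [
    TrialRecord("Drug A", "TRIAL-1", "approved", "randomized, placebo-controlled",
                "response at week 24", 0.40, "oral, once daily"),
    TrialRecord("Drug B", "TRIAL-2", "phase 3", "randomized, placebo-controlled",
                "response at week 24", 0.55, "subcutaneous, monthly"),
    TrialRecord("Drug C", "TRIAL-3", "phase 3", "open label",
                "response at week 12", 0.30, "IV, every 2 weeks"),
]

for record in competitive_candidates(records):
    print(f"{record.drug} ({record.trial}): {record.efficacy:.0%} "
          f"vs bar {efficacy_bar(records):.0%}")
```

Each row is keyed to a trial, not just a drug, so a second trial of the same drug gets its own row with its own endpoint and timepoint.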
Some notes on Mergers and Acquisitions
Acquisitions are how biotech works. You either get acquired early or live long enough to become a large cap growth name.
Consensus M&A candidates don’t get taken out
The most common profile is late stage/commercial, best in class drugs. Everyone wants these.
Pharma will buy scarce assets (not crowded spaces) and companies where they need to “strike now” or the valuation runs away.
Large Funds and Catalyst deserts
You are not accessing the same information as large funds. They meet management teams on demand and talk to researchers, investors, and executives all. day. long. Some are running preclinical experiments themselves. In the vein of “every change matters”, these funds are tracking every change.
But you have an advantage over them: you have time. No one is on your back about returns so you can wait. Take advantage of that flexibility.
Some cleansing memes
Check out Goose Data
Conclusion
What’s the point of this post? All these lessons are simply my own thoughts on how to evaluate data, research a drug in development, and invest in a biotech company. Again, not financial advice. I hope this helps someone understand a bit more about biotech in ways I don’t think ChatGPT could teach them.
Appendix:
I would love to see someone in the drug development world add their knowledge of drug development. I feel like medicinal chemistry can be similarly opaque compared to interpreting clinical data and anyone willing to publicly share their thoughts is highly valued.
Functional outcome = walk distance, etc based on patients. Biomarkers = obtained through a lab value or other objective measure.