An Evidence-Based Review of SARS-CoV-2 Circulation Prior to the December 2019 Wuhan Outbreak
Executive Summary
This report provides a comprehensive, evidence-based analysis of the global circulation of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) prior to the first officially recognized cluster of COVID-19 cases in Wuhan, China, in December 2019. While the Wuhan outbreak marks the definitive point of epidemiological detection, a significant body of scientific evidence, corroborated by intelligence assessments, indicates that the virus's initial emergence in humans and subsequent period of low-level, cryptic circulation began earlier, most likely in October or November 2019.
The analysis synthesizes three primary lines of inquiry. First, molecular phylogenetics and viral evolution studies establish a robust temporal framework. The high genomic similarity of SARS-CoV-2 to bat-hosted coronaviruses, particularly RaTG13, points to an ancestry in bat viruses. However, the notable evolutionary distance between these known animal viruses and SARS-CoV-2 means that a direct progenitor has not yet been identified, pointing to a complex evolutionary history that may have involved recombination in one or more intermediate animal hosts. Molecular clock analyses, which are independent of any specific origin hypothesis, consistently place the most recent common ancestor of human SARS-CoV-2 strains between late October and late November 2019.
Second, retrospective surveillance of human clinical samples has yielded suggestive, though not uniformly conclusive, evidence of early circulation. A large-scale study of archived U.S. blood donations found SARS-CoV-2-reactive antibodies in samples collected as early as mid-December 2019. A clinical case in France was retrospectively identified via RT-PCR in a patient hospitalized on December 27, 2019, with no travel history to China. While these findings are significant, they are subject to important caveats, including the potential for antibody cross-reactivity and the risk of laboratory contamination, which temper their definitive value.
Third, environmental forensics through wastewater-based epidemiology (WBE) provides some of the most compelling physical evidence. Retrospective analyses of archived sewage samples have detected SARS-CoV-2 RNA in Northern Italy in mid-December 2019 and, most credibly, in Southern Brazil in late November 2019. The Brazilian finding, in particular, was validated by an independent laboratory and confirmed by sequencing of the amplified gene fragments, making it one of the strongest pieces of evidence for the virus's presence outside of China before the Wuhan outbreak was reported.
Conversely, several large-scale studies that retrospectively tested thousands of respiratory samples from symptomatic patients in Europe and North America during the pre-pandemic period found no evidence of SARS-CoV-2. These negative findings are crucial, as they suggest that any early circulation was highly sporadic and at a prevalence too low to be detected through conventional clinical surveillance.
In synthesis, the evidence converges on a nuanced conclusion. The first recognized public health event—the outbreak that triggered a global response—was unequivocally in Wuhan in December 2019. However, the virus's biological emergence and initial spillover to humans almost certainly occurred earlier, no later than November 2019. This initial phase was characterized by cryptic, low-level transmission that appears to have been geographically diffuse, with credible evidence pointing to its presence in South America and Europe before the end of 2019. This critical distinction between a virus's initial emergence and its subsequent epidemiological detection is fundamental to understanding the true origins of the COVID-19 pandemic.
Introduction: Establishing the Baseline of a Global Pandemic
The official and universally accepted history of the COVID-19 pandemic begins in December 2019. During that month, public health authorities in Wuhan, the capital city of China's Hubei province, identified a cluster of pneumonia cases of unknown etiology.1 Many of the initial cases were epidemiologically linked to the Huanan Seafood Wholesale Market, although subsequent analysis has suggested that human-to-human transmission may have begun even earlier.2 The causative agent was rapidly identified as a novel betacoronavirus, later named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2).1 The global trajectory of the crisis accelerated swiftly thereafter. On January 30, 2020, the World Health Organization (WHO) declared the outbreak a Public Health Emergency of International Concern (PHEIC), and by March 11, 2020, with the virus spreading rapidly across continents, the WHO officially characterized COVID-19 as a pandemic.2
Framing the Central Question
This established timeline represents the point of epidemiological detection—the moment when a pathogen causes a sufficiently large or severe cluster of illness to be recognized by public health surveillance systems. It does not, however, necessarily represent the point of viral emergence, which is the initial zoonotic spillover event where the virus first successfully infected a human. An inherent lag almost always exists between these two points in the timeline of a new disease. This report is dedicated to a rigorous, evidence-based investigation of that lag period. The central query it seeks to answer is: What credible scientific evidence exists for the presence and circulation of SARS-CoV-2, in any location globally, prior to the recognized Wuhan outbreak of December 2019?
The answer to this question is of profound importance. Understanding the true timeline of the virus's early, undetected spread is critical for refining epidemiological models, assessing the effectiveness of early public health responses, and preparing for future pandemics. It informs our understanding of how a novel pathogen can move silently across the globe before it is ever identified.
The Unresolved Origin Debate as Context
The search for evidence of early circulation is inextricably linked to the ongoing and unresolved debate over the ultimate origin of SARS-CoV-2. The international scientific and intelligence communities have coalesced around two plausible, yet unproven, hypotheses. As summarized in a declassified report from the U.S. Office of the Director of National Intelligence (ODNI), these are: natural exposure to an infected animal and a laboratory-associated incident.4 The natural origin hypothesis posits a spillover event from an animal reservoir, possibly through an intermediate host, to humans. The laboratory-associated incident hypothesis suggests the first human infection could have resulted from activities at the Wuhan Institute of Virology (WIV), such as experimentation or sample handling.4
This report does not aim to resolve this debate. However, the existence of these two competing narratives provides essential context. A natural origin could have occurred anywhere within the geographic range of the relevant animal species, whereas a laboratory-associated incident would, by definition, be localized to Wuhan. Despite this geographical divergence, both scenarios must ultimately align with the biological evidence of the virus itself and its evolutionary timeline.
Crucially, the U.S. Intelligence Community (IC) assessment provides a temporal benchmark that is consistent with scientific estimates. The IC assesses that the initial, small-scale human exposure to SARS-CoV-2 occurred "no later than November 2019".4 This assessment immediately establishes a documented gap between the first likely infection and the first recognized cluster of cases. Furthermore, the IC assesses that Chinese officials "probably did not have foreknowledge that SARS-CoV-2 existed" before it was isolated by WIV researchers following the public recognition of the outbreak.4 This suggests that the early circulation was cryptic not only to the world but likely to authorities in China as well.
The convergence of scientific inquiry and intelligence analysis on a pre-December 2019 timeline frames the entire investigation that follows. The period between a potential first infection in October or November 2019 and the recognized outbreak in December represents a critical window of silent transmission. The remainder of this report will systematically evaluate the direct and indirect evidence that has been uncovered from this period, seeking to piece together a more complete picture of the pandemic's clandestine beginnings.
Section 1: The Progenitor Virus: Evolutionary Timelines and Animal Reservoirs
To understand when and where SARS-CoV-2 was first found, it is essential to first understand where it came from in an evolutionary sense. The genetic code of the virus itself contains a historical record of its ancestry, allowing scientists to trace its lineage back through animal populations and estimate the timeframe of its emergence. This molecular evidence provides the most fundamental and origin-agnostic baseline for assessing all other claims of early detection.
1.1. The Bat Coronavirus Connection
There is broad and enduring scientific consensus that SARS-CoV-2 is a zoonotic virus that ultimately originated in bats.6 This conclusion rests on extensive evidence from virology and genomics. SARS-CoV-2 belongs to the Betacoronavirus genus and, more specifically, the Sarbecovirus subgenus.10 This subgenus is particularly rich in viruses hosted by various species of bats, especially horseshoe bats (Rhinolophus genus), which have a vast and overlapping geographic range across Southeast Asia and Southern China.10
The most direct evidence for this connection came with the publication of the genomic sequence of a bat coronavirus named RaTG13. This virus was identified from a sample collected in 2013 from a horseshoe bat (Rhinolophus affinis) in Yunnan Province, China, and its sequence was held at the Wuhan Institute of Virology.6 When compared to the first sequences of SARS-CoV-2 from patients in Wuhan, RaTG13 was found to share a remarkable 96.2% whole-genome identity.11 This made it, at the time of its discovery, the closest known relative to the pandemic virus. Subsequent wildlife surveillance has identified other related bat viruses. One notable example is RmYN02, discovered in Rhinolophus malayanus bats in Yunnan province in 2019, which shares 93.3% overall identity with SARS-CoV-2 but, critically, has a 97.2% identity in the 1ab gene, the longest and most conserved region of the coronavirus genome.6 These discoveries firmly established bats as the natural reservoir for the viral lineage that gave rise to SARS-CoV-2.
1.2. The Evolutionary Gap and the Search for a Direct Ancestor
While a 96.2% genomic similarity between RaTG13 and SARS-CoV-2 is striking, the remaining 3.8% difference is profoundly significant from an evolutionary perspective. In a genome of approximately 30,000 nucleotides, this equates to over 1,100 different nucleotide positions.7 Given the known mutation rates of coronaviruses, this genetic distance represents decades of evolutionary divergence, not months or a few years. Phylogenetic analyses estimate that the most recent common ancestor of SARS-CoV-2 and RaTG13 existed roughly 40 to 70 years ago.10
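The arithmetic behind this paragraph can be checked with a short sketch. The genome length and identity figure are taken from the text; the function names and the two substitution rates are hypothetical placeholders chosen only to show the order of magnitude, not values from the cited studies. The divergence estimate also assumes a constant rate on both lineages and ignores recombination, which the published analyses handle explicitly.

```python
# Figures quoted in the text.
GENOME_LENGTH_NT = 30_000   # approximate SARS-CoV-2 genome length
IDENTITY = 0.962            # RaTG13 vs. SARS-CoV-2 whole-genome identity


def differing_positions(genome_length: int, identity: float) -> int:
    """Approximate count of nucleotide positions that differ."""
    return round(genome_length * (1.0 - identity))


def divergence_years(identity: float, rate_per_site_per_year: float) -> float:
    """Years since two lineages split, assuming both accumulate substitutions
    at the same constant rate, so each lineage carries half the distance."""
    distance = 1.0 - identity
    return distance / (2.0 * rate_per_site_per_year)


diff = differing_positions(GENOME_LENGTH_NT, IDENTITY)
print(f"Differing positions: about {diff}")

# HYPOTHETICAL rates (substitutions per site per year), for illustration only.
for rate in (3e-4, 5e-4):
    years = divergence_years(IDENTITY, rate)
    print(f"rate={rate:.0e} -> about {years:.0f} years since divergence")
```

Under these placeholder rates the split falls several decades in the past, which is the qualitative point of the paragraph: a 3.8% difference is far too large to have accumulated in months or a few years.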
This finding is of paramount importance: it shows that RaTG13 is a relatively distant cousin of SARS-CoV-2, not its direct parent or progenitor.7 The existence of this "evolutionary gap" implies that there must be an undiscovered, more closely related bat coronavirus (or a chain of them) that is the true, direct ancestor of SARS-CoV-2. This "missing link" virus would be expected to share >99% genomic identity with SARS-CoV-2.4 Despite extensive and ongoing surveillance of wildlife populations, no such virus has been found to date. This failure to identify a direct progenitor in an animal reservoir remains one of the most significant knowledge gaps in understanding the virus's origins and is a central point of contention in the origin debate.4
1.3. The Intermediate Host Hypothesis and Recombination
The evolutionary gap between known bat viruses and SARS-CoV-2 strongly suggests the involvement of one or more intermediate animal hosts. This is a common pathway for zoonotic spillovers. For the original SARS-CoV outbreak in 2002-2004, bats were the natural reservoir, but the virus passed through and adapted in civet cats sold at live animal markets before infecting humans.6 A similar multi-stage process is considered highly plausible for SARS-CoV-2.
The genetic makeup of SARS-CoV-2 is not uniform in its similarity to its relatives; it appears to be a mosaic, pointing toward a complex history of recombination, a process where different coronaviruses infecting the same host can swap genetic segments. This has led to several key hypotheses regarding intermediate hosts:
Pangolins: Early in the pandemic, Malayan pangolins (Manis javanica) emerged as a strong candidate for an intermediate host. SARS-CoV-2-like coronaviruses were discovered in pangolins that had been seized during anti-smuggling operations in southern China.6 While the overall genome of these pangolin coronaviruses is less similar to SARS-CoV-2 than RaTG13 is (around 90-91% identity), a functionally critical part of the virus showed a much closer relationship. The Receptor-Binding Domain (RBD)—the specific part of the spike protein that directly latches onto the human ACE2 receptor to initiate infection—of the pangolin coronavirus is remarkably similar to the RBD of SARS-CoV-2. In fact, it is more similar than RaTG13's RBD.11 This has led to a prominent hypothesis that the direct ancestor of SARS-CoV-2 may have been a recombinant virus, possibly a bat virus that acquired a pangolin-like RBD in an intermediate host, thereby gaining a high affinity for human cells.6
Other Potential Hosts and the Furin Cleavage Site: Another key feature of the SARS-CoV-2 spike protein is its polybasic furin cleavage site, created by a short insertion of four amino acids (PRRA) at the junction of its S1 and S2 subunits.16 This site is cleaved by the human enzyme furin, a process that dramatically enhances the virus's ability to enter human cells and is a major contributor to its high transmissibility.10 This feature is notably absent from RaTG13 and the pangolin coronaviruses, and it is generally rare among bat-hosted Sarbecoviruses.6 However, furin cleavage sites have evolved independently in other coronavirus lineages, including some found in rodents.10 This has led to a hypothesis that the progenitor of SARS-CoV-2 may have passed through a rodent species, where it acquired this critical feature through recombination or convergent evolution.17 Other animals known to be highly susceptible to SARS-CoV-2, such as raccoon dogs, minks, and deer, have also been considered as potential intermediate hosts where such evolutionary events could have occurred.8
This mosaic-like genetic structure strongly indicates that the emergence of SARS-CoV-2 was not a simple, single jump from one animal to a human. It was likely the culmination of a multi-step evolutionary process involving different viral lineages mixing in one or more intermediate animal species, producing a novel recombinant virus uniquely adapted for efficient human-to-human transmission.
1.4. Molecular Clock Projections of the First Human Infection
The most objective and origin-agnostic evidence for the timing of the pandemic's start comes from molecular clock analysis. This technique uses the genetic sequences from early patient samples to estimate a timeline of viral evolution. By cataloging the genetic differences between various early strains and applying an estimated average substitution rate for the virus, scientists can extrapolate backward in time to estimate the time to their most recent common ancestor (TMRCA). The most recent common ancestor is the hypothetical ancestral virus from which all sampled human infections descended, and its estimated date provides the most likely timeframe for the initial successful and sustained spillover into the human population.
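The backward extrapolation described above can be illustrated with a minimal root-to-tip regression: fit a straight line through mutation counts against sampling dates, then find where that line reaches zero mutations. This is a simplified sketch, not the method of the cited studies, which rely on full phylogenetic models. The function name, the planted root date, and the rate of one mutation every 15 days are all synthetic assumptions used only to show that the procedure recovers a known answer.

```python
from datetime import date, timedelta

REFERENCE_DAY = date(2019, 1, 1)  # arbitrary origin for converting dates to numbers


def estimate_tmrca(samples):
    """Fit mutations = slope * day + intercept by least squares and return
    (date where the fitted line reaches zero mutations, slope per day)."""
    xs = [(day - REFERENCE_DAY).days for day, _ in samples]
    ys = [mutations for _, mutations in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    root_offset = -intercept / slope  # day on which the line crosses zero
    return REFERENCE_DAY + timedelta(days=round(root_offset)), slope


# SYNTHETIC data: a planted root date and a perfectly clock-like rate.
# Neither value comes from the cited studies.
PLANTED_ROOT = date(2019, 11, 15)
samples = [(PLANTED_ROOT + timedelta(days=k), k / 15) for k in (30, 45, 60, 75, 90)]

tmrca, rate_per_day = estimate_tmrca(samples)
print("Estimated TMRCA:", tmrca)
print("Estimated rate (mutations per year):", round(rate_per_day * 365, 1))
```

With real sequences the points scatter around the line, so the estimate comes with an uncertainty interval rather than a single date, which is why published results are reported as a window of several weeks.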
Multiple independent phylogenetic studies, conducted by different research groups using different datasets and methodologies, have converged on a remarkably consistent timeframe. These analyses consistently place the TMRCA for SARS-CoV-2, and thus the likely start of sustained human transmission, between late October and late November 2019.2 This scientific estimate provides a robust temporal "hard boundary" that any plausible origin scenario must accommodate. It is consistent with the U.S. Intelligence Community's assessment that the first human infection occurred no later than November 2019.4 This molecular data serves as a critical scientific benchmark against which all claims of earlier human circulation must be evaluated. Any claim of widespread human infection prior to this October-November 2019 window would be extraordinary and would require a radical revision of our understanding of the virus's evolutionary rate.
Section 2: Retrospective Human Surveillance: The Search for "Patient Zero"
While molecular evolution provides a theoretical timeline for the virus's emergence, the search for direct physical evidence in humans has focused on retrospective analysis of clinical samples collected before the pandemic was recognized. These studies, which hunt for genetic or immunological traces of the virus in archived biological material, offer the tantalizing possibility of identifying the very first cases. However, such forensic investigations are fraught with methodological challenges and are highly susceptible to errors that can lead to false conclusions. The burden of proof for any claim of pre-pandemic infection is therefore exceptionally high.
2.1. Serological Traces in U.S. Blood Donations (December 2019)
One of the most significant studies in this area was published in the journal Clinical Infectious Diseases. Researchers retrospectively tested a repository of 7,389 archived serum samples from routine blood donations collected by the American Red Cross across nine states between December 13, 2019, and January 17, 2020.18 The goal was to search for antibodies against SARS-CoV-2, which would indicate a past infection in the donor.
Methodology: Recognizing the high potential for false positives in serological testing, the study employed a rigorous, multi-tiered validation algorithm designed to maximize specificity.18
Initial Screen: All samples were first screened using a sensitive pan-immunoglobulin (pan-Ig) enzyme-linked immunosorbent assay (ELISA) that detects any antibodies (IgG, IgM, IgA) binding to the S1 subunit of the SARS-CoV-2 spike protein.
Confirmatory Assay: Samples that were reactive on the initial screen were then re-tested with a second, more specific confirmatory assay.
Neutralization Assays: A subset of the confirmed-reactive samples underwent further testing with microneutralization (MN) and surrogate neutralization assays. These are functional tests that go beyond simple antibody binding; they measure whether the antibodies present can physically block the virus from infecting cells, providing stronger evidence of a genuine and specific immune response.18
Key Finding: Of the 7,389 total donations, 106 (1.4%) were found to be reactive on both the initial and confirmatory ELISA tests.18 Most critically, 39 of these reactive samples came from donations collected between December 13 and December 16, 2019, from residents of California, Oregon, and Washington. This was weeks before the first U.S. case was confirmed on January 20, 2020. Further functional testing was highly informative: among the reactive samples that underwent further testing, 84 demonstrated neutralizing activity in the MN assay, indicating the presence of functional, virus-blocking antibodies.18
Conclusion and Caveats: The authors carefully concluded that their findings "suggest that SARS-CoV-2 may have been introduced into the United States" earlier than previously recognized.18 This study is significant because of its scale and methodological rigor. However, it is crucial to understand its inherent limitations, which prevent it from being considered definitive proof of infection:
Cross-Reactivity: The primary concern with any serological study is the potential for cross-reactivity. The human population is regularly exposed to several endemic common cold coronaviruses (e.g., OC43, HKU1). Antibodies generated against these seasonal viruses can sometimes cross-react with antigens from a novel coronavirus like SARS-CoV-2, leading to a false-positive result. While the use of neutralization assays reduces this risk, it does not eliminate it entirely.
Lack of "True Positive" Confirmation: The gold standard for confirming an infection is a positive molecular test (like RT-PCR) that detects the virus's genetic material. Without a corresponding positive PCR test from these blood donors at the time of their infection, the antibody results alone cannot be considered "true positives".18
Low Signal Strength: The study noted that many of the reactive samples had signal-to-cutoff ratios that were near the assay's threshold for positivity, making their interpretation ambiguous.18 Only a few samples showed the very strong antibody reactivity typically seen after a confirmed infection.
In summary, the U.S. blood donation study provides plausible and suggestive evidence of sporadic SARS-CoV-2 exposure on the U.S. West Coast in mid-December 2019. It represents a significant data point but falls short of being incontrovertible proof due to the inherent limitations of retrospective serology.
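The screening arithmetic behind these caveats can be sketched as follows. The donation and reactive counts come from the text; the specificity values are hypothetical, since the assays' actual performance figures are not given here, and the helper names are illustrative. The calculation also assumes the two assays err independently, which cross-reactive antibodies could violate.

```python
TOTAL_DONATIONS = 7_389   # from the text
REACTIVE_ON_BOTH = 106    # reactive on screening and confirmatory ELISA, from the text


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def expected_false_positives(n_tested: int, *specificities: float) -> float:
    """Expected false positives among n_tested truly negative samples when a
    sample must be reactive on every assay. Assumes independent assay errors."""
    rate = 1.0
    for spec in specificities:
        rate *= 1.0 - spec
    return n_tested * rate


reactive_rate = percent(REACTIVE_ON_BOTH, TOTAL_DONATIONS)
print(f"Reactive on both ELISAs: {reactive_rate:.2f}%")

# HYPOTHETICAL specificities, for illustration only.
single = expected_false_positives(TOTAL_DONATIONS, 0.99)
tiered = expected_false_positives(TOTAL_DONATIONS, 0.99, 0.99)
print(f"One assay at 99% specificity: about {single:.0f} false positives expected")
print(f"Two independent assays at 99%: about {tiered:.1f} false positives expected")
```

The contrast shows why a multi-tiered algorithm matters when true prevalence is very low: a single assay with seemingly high specificity could by itself produce a number of false positives comparable to the observed signal. If errors are correlated through cross-reactivity, the tiered figure would be higher than the independent-error estimate.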
2.2. A Retrospectively Identified Clinical Case in France (December 2019)
A more direct, though highly controversial, claim of early circulation came from a hospital north of Paris. In a report published in the International Journal of Antimicrobial Agents, a team of physicians described the retrospective identification of a SARS-CoV-2 infection in a patient hospitalized nearly a month before France's first official cases were announced.19
The Case: The patient was a 42-year-old fishmonger, born in Algeria and living in France, who was admitted to an intensive care unit on December 27, 2019.21 He presented with severe symptoms, including haemoptysis (coughing up blood), fever, and a dry cough. A chest scan revealed bilateral ground-glass opacities, a pattern that would later become recognized as a hallmark of severe COVID-19 pneumonia.21 At the time, tests for influenza and other common respiratory pathogens were negative, and his illness was diagnosed as pneumonia of unknown origin. He was treated and eventually recovered. Crucially, the patient had no recent travel history and no known link to China.21 Months later, in April 2020, as the pandemic was raging in Paris, the physicians retrieved his stored, frozen respiratory sample from December and tested it using RT-PCR for SARS-CoV-2. The test returned a positive result.19
Significance: If accurate, this finding would be monumental. It would establish not only that SARS-CoV-2 was present in Europe in 2019 but also that local, community transmission was occurring, completely untethered from known travel from China. This would fundamentally rewrite the early history of the pandemic's spread in Europe.
Scientific Critique and Controversy: The report was met with immediate and intense scrutiny from the international scientific community. The finding, while striking, is widely considered to be unproven due to several critical methodological concerns that were highlighted by independent experts 23:
High Risk of Laboratory Contamination: The single most significant criticism is the risk of cross-contamination. The retrospective test was performed in April 2020, at a time when French laboratories were processing an enormous volume of samples from COVID-19 patients, many with extremely high viral loads. Testing a single, four-month-old sample, likely containing a very low amount of degraded RNA, in such an environment carries an exceptionally high risk of contamination from a contemporary positive sample, from the lab environment, or from the testing reagents themselves.23
Lack of Definitive Confirmation: The viral load in the sample was reportedly low, which is common in older samples but also makes contamination a greater concern. The fatal flaw of the study, in the view of many experts, was the failure to perform genomic sequencing on the amplified viral material. Sequencing would have been the definitive confirmatory step. It could have proven that the genetic material was indeed SARS-CoV-2 and, by analyzing its sequence, could have shown whether it was a unique, early lineage consistent with a December 2019 infection, or if it matched the strains circulating in Paris in April 2020, which would have been a smoking gun for contamination. Without this sequencing data, the result remains unconfirmed.23
Epidemiological Inconsistency: A major question raised by critics was epidemiological. If the virus was genuinely causing severe, ICU-level pneumonia in Paris in late December, it is difficult to explain why a larger, noticeable outbreak did not erupt until late February or early March. While sporadic, dead-end transmission chains are possible, the presence of a case severe enough to require intensive care would suggest a more established community transmission that should have become apparent sooner.23
These retrospective clinical studies are a powerful illustration of the principle that extraordinary claims require extraordinary evidence. While they hint at the possibility of a much earlier and more geographically diffuse start to the pandemic, the methodological hurdles are immense. The U.S. serology data is suggestive of a faint signal amidst significant noise, while the French clinical case is a single, uncorroborated data point that cannot escape the profound shadow of potential contamination. The geographic diversity of these signals, however, does raise the compelling possibility that if cryptic circulation was occurring in late 2019, it may have been seeded to multiple international locations via asymptomatic travelers before any alarms were raised anywhere.
Section 3: Environmental Forensics: Detecting Viral Echoes in Wastewater
While the search for early infections in individual humans is fraught with challenges, another scientific discipline has provided a powerful, population-level tool: wastewater-based epidemiology (WBE). This environmental forensic approach hunts for the genetic "echoes" of a virus in a community's sewage, offering a unique and non-invasive window into public health dynamics.
3.1. Principles of Wastewater-Based Epidemiology (WBE)
The foundation of WBE for COVID-19 is the biological fact that SARS-CoV-2 is shed in the feces of infected individuals.24 This shedding occurs in both symptomatic and asymptomatic cases and can begin several days before the onset of symptoms, continuing for days or weeks.25 The viral RNA, protected within the viral particle, travels from household toilets through the sewer system to a centralized wastewater treatment plant (WWTP).28
By collecting composite samples of raw, untreated sewage at the inlet of a WWTP, scientists can effectively capture a pooled biological sample from the entire population served by that plant, which can be thousands or even millions of people.24 Using molecular techniques like quantitative reverse transcription polymerase chain reaction (RT-qPCR), laboratories can then detect and quantify the concentration of SARS-CoV-2 RNA in the sewage.30
This method has several profound advantages for public health surveillance. It is independent of clinical testing availability or individual healthcare-seeking behavior. Because it captures shedding from asymptomatic and pre-symptomatic individuals, WBE often serves as a leading indicator, with viral concentrations in wastewater rising several days to a week before an increase in clinically reported cases is observed.26 This makes it a powerful early warning system. For the purposes of investigating the pandemic's origins, WBE offers the potential to detect the presence of the virus in a community even when the number of infected individuals is very small and clinically invisible.
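The leading-indicator property can be illustrated by a lagged correlation: shift the clinical case series against the wastewater series and find the shift at which the two agree best. This is a minimal sketch on synthetic data. The curve shape, the planted five-day lead, and the function names are assumptions made for illustration and do not describe any real surveillance dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def best_lead_days(wastewater, cases, max_lag):
    """Lag (days) at which wastewater[t] correlates best with cases[t + lag]."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        w = wastewater[: len(wastewater) - lag]  # drop the unmatched tail
        c = cases[lag:]                          # cases shifted back by lag days
        r = pearson(w, c)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


def epidemic_curve(t):
    """SYNTHETIC logistic-shaped signal; not fitted to any real data."""
    return 100.0 / (1.0 + math.exp(-(t - 20) / 3.0))


PLANTED_LEAD = 5  # hypothetical: wastewater rises five days before cases
days = range(40)
cases = [epidemic_curve(t) for t in days]
wastewater = [epidemic_curve(t + PLANTED_LEAD) for t in days]

lead, r = best_lead_days(wastewater, cases, max_lag=10)
print(f"Wastewater leads cases by {lead} days (r = {r:.3f})")
```

Real series are noisy and affected by flow, dilution, and reporting delays, so the estimated lead is usually a range rather than a single value.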
3.2. Evidence of Early Circulation in Europe (Late 2019)
Several research groups in Europe have applied these techniques to archived wastewater samples that were originally collected for other projects before the COVID-19 pandemic was known.
Italy: A seminal study conducted by researchers at the Italian National Institute of Health (Istituto Superiore di Sanità) provided some of the earliest and most compelling environmental evidence. The team, led by Giuseppina La Rosa, analyzed 40 archived influent wastewater samples that had been collected from five WWTPs in the northern Italian cities of Milan, Turin, and Bologna between October 2019 and February 2020.29 They used two distinct and robust molecular methods for confirmation: a nested RT-PCR and a real-time RT-qPCR assay. Their analysis yielded positive results for SARS-CoV-2 RNA in samples collected from both Milan and Turin on December 18, 2019.29 This finding was significant not only for its early date—more than two months before Italy's first officially detected case in late February 2020—but also because it suggested simultaneous, independent circulation of the virus in two major, geographically distinct metropolitan areas.
Spain: Research from Spain has produced two separate and starkly different claims of early detection.
A Plausible Finding (January 2020): A research group at the University of Barcelona, as part of a broader WBE surveillance project, analyzed archived samples from two large WWTPs in the city. They reported a confirmed positive detection of SARS-CoV-2 RNA in a sewage sample collected on January 15, 2020.33 This was 41 days before the first clinical case of COVID-19 was officially reported in Barcelona. This finding is epidemiologically plausible and consistent with the expected timeline of cryptic circulation preceding a recognized outbreak.
A Highly Controversial Finding (March 2019): In a later preprint, the same research group made the extraordinary claim of having detected SARS-CoV-2 RNA in a single archived wastewater sample from March 12, 2019, more than nine months before the Wuhan outbreak.34 This claim, if true, would upend the established understanding of the pandemic's origin. However, it was met with widespread and severe criticism from the scientific community for major methodological weaknesses.36 The critique centered on several key points: the result was from a single, non-replicated sample; only two of the five PCR gene targets used were weakly positive, while the confirmatory E-gene assay was negative; and, most importantly, no genomic sequencing was performed to confirm the identity of the amplified material. Given the immense risk of cross-contamination in a lab handling highly positive 2020 samples, this unconfirmed, anomalous result is almost universally considered by experts to be a laboratory artifact or false positive, not credible evidence of circulation in March 2019.36
3.3. Evidence from South America (Late 2019)
Perhaps the most robust and scientifically credible claim of pre-December 2019 circulation outside of China comes from Brazil.
Brazil: A study led by Gislaine Fongaro at the Federal University of Santa Catarina analyzed archived human sewage samples from Florianopolis, a major city and tourist destination in southern Brazil.37 The research team reported detecting SARS-CoV-2 RNA in two independent samples collected on November 27, 2019.37 This date is nearly two months before the first confirmed case in the Americas (in the U.S., confirmed on January 20, 2020) and three months before Brazil's first official case.
The credibility of this finding is substantially enhanced by the rigorous confirmation methods employed by the researchers. The initial positive RT-qPCR result was subsequently confirmed by an independent laboratory using a different testing system. Critically, the team then performed Sanger sequencing on the amplified genetic material (amplicons) from the S and RdRp genes, which confirmed that the sequence had 100% identity with the corresponding regions of the SARS-CoV-2 reference genome.38 This sequencing step is the gold standard for validation in retrospective WBE, as it rules out false positives caused by other coronaviruses or nonspecific amplification, although sequencing alone cannot exclude contamination with genuine SARS-CoV-2 material. This multi-layered confirmation makes the Brazilian finding the strongest piece of direct physical evidence to date for the circulation of SARS-CoV-2 outside of China in late 2019.
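The lead time quoted for the Florianopolis sample can be checked with simple date arithmetic. The sketch below uses only the two dates given above (the November 27, 2019 sample and the January 21, 2020 U.S. case); the function name days_between is illustrative. The date of Brazil's first official case is not stated in this section, so it is left out rather than assumed.

```python
from datetime import date

# Dates taken from the text above.
SAMPLE_DATE = date(2019, 11, 27)     # Florianopolis sewage sample
FIRST_US_CASE = date(2020, 1, 21)    # first confirmed case in the Americas


def days_between(earlier: date, later: date) -> int:
    """Return the number of whole days from `earlier` to `later`."""
    return (later - earlier).days


lead_days = days_between(SAMPLE_DATE, FIRST_US_CASE)
print(f"Sample precedes the first U.S. case by {lead_days} days")
```

The same helper can be applied to any other sample date and first-case date pair discussed in this report.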
3.4. Methodological Challenges and Caveats of Retrospective WBE
The divergent credibility of the Spanish March 2019 claim and the Brazilian November 2019 claim perfectly illustrates the methodological challenges inherent in retrospective WBE. Detecting what is likely a very low concentration of viral RNA in a complex and inhibitory environmental matrix like sewage is a significant technical hurdle.39 This creates a "signal-to-noise" problem where distinguishing a true, faint signal from background noise or contamination is paramount.
Therefore, a clear hierarchy of evidence exists for such studies. The credibility of a finding rises with the rigor of its confirmation. A single, weakly positive PCR result, without replication or sequencing, is scientifically weak and highly suspect. A result that is confirmed using multiple gene targets, reproduced in an independent laboratory, and ultimately validated by genomic sequencing is scientifically robust and credible. These stringent standards are necessary to prevent the propagation of erroneous conclusions based on laboratory artifacts. The credible findings from Italy and Brazil, both obtained in major international travel and tourism hubs, lend significant support to the hypothesis that the virus was seeded globally via asymptomatic travelers in the fall of 2019, circulating at a low, clinically invisible level for weeks or months before the Wuhan outbreak brought it to the world's attention.
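The hierarchy described above can be expressed as a small classification sketch. This is an illustration only, not a scoring scheme used by any of the cited studies: the Detection class, its three boolean fields, and the tier names are hypothetical, and the flags for each finding are a simplified encoding of how this report describes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One retrospective wastewater detection and its confirmation layers."""
    label: str
    multi_target_confirmed: bool  # consistent positives across gene targets or assays
    independent_lab: bool         # reproduced in a separate laboratory
    sequenced: bool               # amplicon identity confirmed by sequencing


def confirmation_tier(d: Detection) -> str:
    """Map the number of confirmation layers to an illustrative tier."""
    layers = sum([d.multi_target_confirmed, d.independent_lab, d.sequenced])
    if layers == 3:
        return "robust"
    if layers == 0:
        return "weak"
    return "intermediate"


# Flags follow this report's description of each finding (an assumption-laden summary).
florianopolis = Detection("Florianopolis, Nov 2019", True, True, True)
barcelona = Detection("Barcelona, Mar 2019", False, False, False)

for d in (florianopolis, barcelona):
    print(f"{d.label}: {confirmation_tier(d)}")
```

Counting layers treats each form of confirmation as equally important, which is a simplification; the text above gives sequencing the greatest weight.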
Section 4: The Counter-Narrative: Studies Finding No Evidence of Early Circulation
To achieve a balanced and comprehensive understanding, it is essential to consider not only the studies that have reported evidence of early SARS-CoV-2 circulation but also the significant body of research that has actively searched for such evidence and found none. In scientific inquiry, the absence of evidence, when it has been systematically and rigorously sought, constitutes important evidence in itself. These "negative" studies provide a crucial counter-narrative that helps to constrain the scale and nature of any potential pre-pandemic spread.
The Importance of Negative Results
Numerous research teams around the world have conducted large-scale retrospective analyses of archived clinical samples, primarily respiratory swabs that were collected from patients with influenza-like illness (ILI) during the fall and winter of 2019-2020. The logic behind these studies is straightforward: if SARS-CoV-2 was circulating widely and causing significant respiratory disease before December 2019, it should be detectable in at least some of the thousands of samples taken from symptomatic patients during routine surveillance for influenza and other respiratory viruses. The overwhelmingly negative results from these studies argue strongly against the presence of a large, hidden epidemic during that period.
Summary of Key Negative Findings
Several key studies from different countries have reported a complete absence of SARS-CoV-2 in pre-pandemic clinical samples 40:
Lombardy, Italy: This region in northern Italy became the first major epicenter of the pandemic in Europe. A comprehensive study retrospectively tested 1,581 respiratory samples that had been collected for influenza surveillance between October 2019 and January 2020. Despite the large sample size from what would later become a hotspot, none of the samples tested positive for SARS-CoV-2 RNA. The authors concluded that their results "do not support evidence of widespread circulation for SARS-CoV-2" in Lombardy during this period.40 This aligns with phylodynamic analyses that estimate the seeding of the virus in Lombardy occurred in early January 2020.40
Switzerland: A similar study in Switzerland retrospectively tested pre-pandemic nasopharyngeal swabs from patients presenting with respiratory symptoms. This investigation also found no evidence of SARS-CoV-2 circulation before the country's first officially identified case.40
Quebec, Canada: In a region of Quebec, researchers analyzed 1,440 archived nucleic acid extracts from respiratory samples collected for influenza testing between January 1 and February 20, 2020. Despite testing over two-thirds of all samples collected in the region during that time, none tested positive for SARS-CoV-2. This led to the conclusion that the virus was likely not circulating in that region prior to late February 2020.40
Nagasaki, Japan: A study in Nagasaki evaluated 182 stored nasopharyngeal swabs from adult outpatients with influenza-like illness during the 2019-2020 flu season. All samples tested negative for SARS-CoV-2, leading the researchers to conclude there was no evidence of large-scale community spread before the first confirmed case in the region.40
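One way to gauge how strongly a run of negative results constrains early spread is the upper confidence bound on prevalence when 0 of n samples test positive. The sketch below computes the exact one-sided bound, 1 - alpha^(1/n), alongside the familiar "rule of three" approximation (3/n), using the sample sizes reported above; the Swiss study is omitted because its sample size is not given here. It assumes independent samples and a perfectly sensitive assay, and the bound applies only to the population actually sampled (patients with respiratory symptoms), not to the community as a whole. Function names are illustrative.

```python
def exact_upper_bound(n: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on prevalence when 0 of n samples are positive."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    # Solve (1 - p)^n = alpha for p.
    return 1.0 - alpha ** (1.0 / n)


def rule_of_three(n: int) -> float:
    """Common approximation to the 95% upper bound for zero events in n trials."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return 3.0 / n


# Sample sizes as reported in the studies summarized above.
studies = {"Lombardy": 1581, "Quebec": 1440, "Nagasaki": 182}

for name, n in studies.items():
    print(f"{name}: n={n}, exact={exact_upper_bound(n):.4%}, 3/n={rule_of_three(n):.4%}")
```

For the Lombardy series (n = 1,581) the exact 95% bound is roughly 0.19%, whereas the much smaller Nagasaki series (n = 182) can only exclude prevalences above about 1.6% among sampled patients. This is why the negative studies argue against a large hidden epidemic without excluding rare, sporadic infections.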
Contextualizing the Positive and Negative Findings
At first glance, these robust negative findings from clinical samples might seem to directly contradict the positive findings from wastewater and serological studies. However, they are not necessarily mutually exclusive. A more nuanced interpretation suggests that these two sets of data are actually complementary and, when viewed together, paint a more complete picture of the virus's early behavior.
The clinical studies were designed to detect infections in people who were sick enough with respiratory symptoms to seek medical care. The fact that these large-scale efforts found no virus suggests that SARS-CoV-2 was not a significant cause of clinically apparent respiratory illness in these regions during late 2019. In other words, there was no widespread, symptomatic outbreak.
Wastewater-based epidemiology, on the other hand, is a fundamentally different tool. It is not limited to detecting symptomatic cases. Its strength lies in its ability to detect viral shedding from an entire community, including asymptomatic and pre-symptomatic individuals who would never be captured by clinical surveillance.33 Under favorable conditions, WBE can detect the presence of a small number of infected individuals within a population of hundreds of thousands or even millions.
Therefore, the combined evidence from both positive and negative retrospective studies leads to a coherent synthesis: any circulation of SARS-CoV-2 prior to December 2019 was likely highly sporadic, geographically scattered, and at a prevalence far too low to be detected by routine clinical surveillance systems. The virus was present, but it was rare. It was spreading cryptically, primarily through asymptomatic or mildly symptomatic transmission chains, without causing a noticeable wave of severe disease until it reached a critical mass of infections and found favorable conditions for explosive amplification, a threshold that was likely first crossed in Wuhan.
Conclusion: Synthesizing the Evidence for a Pre-Wuhan Timeline
The question of when and where SARS-CoV-2 first began to circulate is one of the most critical and complex issues of the COVID-19 pandemic. A comprehensive review of the available evidence reveals a timeline that is more nuanced than the official narrative of a December 2019 start in Wuhan. While Wuhan is unequivocally the site of the first recognized epidemiological event, the virus's biological journey in humans almost certainly began earlier and was potentially more geographically widespread in its initial, cryptic phase.
Convergence of Evidence
The strongest conclusion that can be drawn from the existing data is that the emergence of SARS-CoV-2 in the human population occurred in the final quarter of 2019. This assessment is not based on a single piece of evidence but on the convergence of three distinct and powerful lines of inquiry:
Molecular Phylogenetics: The virus's own genetic code serves as a molecular clock. Independent analyses of early viral genomes consistently calculate the time to the most recent common ancestor (TMRCA) to be in the October-November 2019 timeframe.2 This provides a robust, evidence-based scientific boundary for the start of sustained human transmission.
Intelligence Assessments: Independent of scientific modeling, the U.S. Intelligence Community, after examining all available sources of information, reached a consonant conclusion, assessing that the first human infection with SARS-CoV-2 occurred no later than November 2019.4 The alignment of these separate analytical domains lends significant weight to this timeframe.
Credible Retrospective Studies: The most scientifically rigorous retrospective studies that have detected physical traces of the virus all point to circulation within or immediately following this projected window. The suggestive serological evidence from the U.S. West Coast dates to mid-December 2019.18 The wastewater evidence from Northern Italy dates to mid-December 2019.32 Most compellingly, the sequence-confirmed wastewater evidence from Brazil dates to late November 2019.37
Weighing the Evidence
Not all claims of early detection are of equal scientific merit. A critical assessment requires establishing a clear hierarchy of evidence based on methodological rigor and the strength of confirmation. The various claims can be categorized as follows:
Highly Improbable: The report of SARS-CoV-2 in Barcelona wastewater from March 2019 is an outlier that contradicts all other genetic and epidemiological data. The lack of replication, weak signal, and failure to perform sequencing confirmation render it scientifically unsubstantiated and highly likely to be a result of laboratory contamination.35
Unproven: The retrospective identification of a clinical case in France from December 2019 is a significant claim, but it remains unproven. The single RT-PCR result on an old sample, performed in a high-contamination environment without the crucial validation of genomic sequencing, means that a laboratory artifact cannot be confidently ruled out.21
Suggestive but Inconclusive: The U.S. serology study detecting antibodies in blood donations from mid-December 2019 is plausible and methodologically sound. However, the inherent and unavoidable possibility of antibody cross-reactivity with other common coronaviruses prevents it from being definitive proof of infection.18
Credible: The wastewater findings from Northern Italy (December 2019) and Florianopolis, Brazil (November 2019) represent the most compelling direct, physical evidence for pre-Wuhan circulation. The Brazilian finding, in particular, stands as the most robust claim due to its confirmation in an independent laboratory and, most importantly, its validation via genomic sequencing.32
The following table summarizes and assesses these key claims:
Location | Date of Sample | Type of Evidence | Key Finding & Methodology | Scientific Assessment & Major Caveats
Florianopolis, Brazil | Nov 27, 2019 | Wastewater RNA | Positive RT-qPCR for SARS-CoV-2 RNA, confirmed by independent lab and sequencing 37 | Credible: Strongest evidence for early circulation due to robust, multi-layered confirmation.
Milan/Turin, Italy | Dec 18, 2019 | Wastewater RNA | Positive results using two different molecular methods (nested RT-PCR and real-time RT-PCR) 29 | Credible: Strong evidence of simultaneous circulation in two cities; confirmed by multiple assays.
California/OR/WA, USA | Dec 13-16, 2019 | Human Serology | Reactive antibodies detected via multi-step ELISA and functional neutralization assays 18 | Suggestive but Inconclusive: Cannot definitively rule out antibody cross-reactivity with other endemic coronaviruses.
Paris, France | Dec 27, 2019 | Human Respiratory Swab RNA | Positive RT-PCR result on a single stored sample from a hospitalized patient with no travel history 19 | Unproven: High risk of lab contamination during retrospective testing; lacks essential genomic sequencing confirmation.
Barcelona, Spain | Mar 12, 2019 | Wastewater RNA | Weakly positive RT-PCR on two of five gene targets in a single, un-replicated sample 35 | Highly Improbable: Contradicts all molecular clock data; widely considered a laboratory artifact or false positive due to lack of confirmation.36
Final Assessment
The weight of the evidence firmly separates the timeline of the virus's biological emergence from its epidemiological detection. The first recognized epidemiological event—the large, sustained outbreak that alerted global health authorities—was unequivocally in Wuhan, China, in December 2019. However, the initial spillover and subsequent cryptic circulation of SARS-CoV-2 in humans almost certainly began earlier, during the fall of 2019. This early, pre-pandemic phase was characterized by low-level, sporadic transmission that was likely seeded to multiple international locations via asymptomatic travelers before the end of 2019. This distinction is not academic; it is fundamental to a complete understanding of how a novel pathogen can emerge and silently achieve a global footprint before it is ever given a name.
Works cited
1. SARS-CoV-2: International Investigation Under the WHO or BWC - PMC - PubMed Central, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC8850392/
2. COVID-19 pandemic - Wikipedia, accessed on October 7, 2025, https://en.wikipedia.org/wiki/COVID-19_pandemic
3. When Did the Pandemic Start and End? - Northwestern Medicine, accessed on October 7, 2025, https://www.nm.org/healthbeat/medical-advances/new-therapies-and-drug-trials/covid-19-pandemic-timeline
4. Updated Assessment on COVID-19 Origins, accessed on October 7, 2025, https://www.dni.gov/files/ODNI/documents/assessments/Declassified-Assessment-on-COVID-19-Origins.pdf
5. Unclassified Summary of Assessment on COVID-19 Origins, accessed on October 7, 2025, https://www.dni.gov/files/ODNI/documents/assessments/Unclassified-Summary-of-Assessment-on-COVID-19-Origins.pdf
6. SARS-CoV-2 Infections in Animals: Reservoirs for Reverse Zoonosis ..., accessed on October 7, 2025, https://www.mdpi.com/1999-4915/13/3/494
7. A Critical Analysis of the Evidence for the SARS-CoV-2 Origin Hypotheses | mBio, accessed on October 7, 2025, https://journals.asm.org/doi/10.1128/mbio.00583-23
8. Animal reservoirs of SARS-CoV-2: calculable COVID-19 risk for older adults from animal to human transmission - PMC, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC8404404/
9. SARS-CoV-2 in animals: potential for unknown reservoir hosts and public health implications - PMC, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC8128218/
10. Zoonotic origins of COVID-19 - Wikipedia, accessed on October 7, 2025, https://en.wikipedia.org/wiki/Zoonotic_origins_of_COVID-19
11. Animal Models, Zoonotic Reservoirs, and Cross-Species Transmission of Emerging Human-Infecting Coronaviruses - Annual Reviews, accessed on October 7, 2025, https://www.annualreviews.org/content/journals/10.1146/annurev-animal-020420-025011?crawler=true&mimetype=application/pdf
12. Analysis of the Genomic Distance Between Bat Coronavirus RaTG13 and SARS-CoV-2 Reveals Multiple Origins of COVID-19 | Request PDF - ResearchGate, accessed on October 7, 2025, https://www.researchgate.net/publication/350977996_Analysis_of_the_Genomic_Distance_Between_Bat_Coronavirus_RaTG13_and_SARS-CoV-2_Reveals_Multiple_Origins_of_COVID-19
13. Comparative Genomic Analyses Reveal a Specific Mutation Pattern Between Human Coronavirus SARS-CoV-2 and Bat-CoV RaTG13 - Frontiers, accessed on October 7, 2025, https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2020.584717/full
14. Synonymous mutations and the molecular evolution of SARS-CoV-2 origins - Oxford Academic, accessed on October 7, 2025, https://academic.oup.com/ve/article/7/1/veaa098/6047024
15. Bat Coronavirus RaTG13 - News-Medical.Net, accessed on October 7, 2025, https://www.news-medical.net/health/Bat-Coronavirus-RaTG13.aspx
16. SARS-CoV-2 and bat RaTG13 spike glycoprotein structures inform on virus evolution and furin cleavage effects - PMC, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC7610980/
17. Structural basis for mouse receptor recognition by bat SARS2-like coronaviruses | PNAS, accessed on October 7, 2025, https://www.pnas.org/doi/10.1073/pnas.2322600121
18. Serologic Testing of US Blood Donations to Identify Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)–Reactive Antibodies: December 2019–January 2020 | Clinical Infectious Diseases | Oxford Academic, accessed on October 7, 2025, https://academic.oup.com/cid/article/72/12/e1004/6012472
19. SARS-CoV-2 was already spreading in France in late December 2019, accessed on October 7, 2025, https://pubmed.ncbi.nlm.nih.gov/32371096/
20. SARS-COV-2 was already spreading in France in late December 2019 - ResearchGate, accessed on October 7, 2025, https://www.researchgate.net/publication/341115926_SARS-COV-2_was_already_spreading_in_France_in_late_December_2019
21. SARS-CoV-2 was already spreading in France in late December 2019 - Bohrium, accessed on October 7, 2025, https://www.bohrium.com/paper-details/sars-cov-2-was-already-spreading-in-france-in-late-december-2019/812640902701907970-11790
22. Doctors Date First COVID-19 Case in France to Late December | The Scientist, accessed on October 7, 2025, https://www.the-scientist.com/doctors-date-first-covid-19-case-in-france-to-late-december-67510
23. expert reaction to report of a COVID-19 case in France in December ..., accessed on October 7, 2025, https://www.sciencemediacentre.org/expert-reaction-to-report-of-a-covid-19-case-in-france-in-december-2019/
24. COVID-19 surveillance in wastewater: An epidemiological tool for the monitoring of SARS-CoV-2 - PMC - PubMed Central, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC9854263/
25. Retrospective screening of routine respiratory samples revealed undetected community transmission and missed intervention opportunities for SARS-CoV-2 in the United Kingdom, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC8459093/
26. Wastewater Monitoring During the COVID-19 Pandemic in the Veneto Region, Italy: Longitudinal Observational Study - JMIR Public Health and Surveillance, accessed on October 7, 2025, https://publichealth.jmir.org/2025/1/e58862
27. SARS-CoV-2 RNA in wastewater anticipated COVID-19 occurrence in a low prevalence area - PubMed Central, accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC7229723/
28. SARS-CoV-2 RNA in urban wastewater samples to monitor the COVID-19 epidemic in Lombardy, Italy (March – June 2020) | medRxiv, accessed on October 7, 2025, https://www.medrxiv.org/content/10.1101/2021.05.05.21256677.full
29. SARS-CoV-2 has been circulating in northern Italy since December ..., accessed on October 7, 2025, https://pmc.ncbi.nlm.nih.gov/articles/PMC7428442/
30. SARS-CoV-2 Wastewater Surveillance Testing Guide for Public Health Laboratories - APHL, accessed on October 7, 2025, https://www.aphl.org/aboutAPHL/publications/Documents/EH-2022-SARSCoV2-Wastewater-Surveillance-Testing-Guide.pdf
31. Assessing SARS-CoV-2 Virus Levels in Sewage | US EPA, accessed on October 7, 2025, https://www.epa.gov/emergency-response-research/assessing-sars-cov-2-virus-levels-sewage
32. SARS-CoV-2 has been circulating in northern Italy since December 2019: Evidence from environmental monitoring | Request PDF - ResearchGate, accessed on October 7, 2025, https://www.researchgate.net/publication/343673510_SARS-CoV-2_has_been_circulating_in_northern_Italy_since_December_2019_Evidence_from_environmental_monitoring
33. Time Evolution of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) in Wastewater during the First Pandemic Wave of COVID-19 in the Metropolitan Area of Barcelona, Spain | Applied and Environmental Microbiology - ASM Journals, accessed on October 7, 2025, https://journals.asm.org/doi/10.1128/aem.02750-20
34. Sentinel surveillance of SARS-CoV-2 in wastewater anticipates the occurrence of COVID-19 cases | medRxiv, accessed on October 7, 2025, https://www.medrxiv.org/content/10.1101/2020.06.13.20129627v1
35. SARS-CoV-2 detected in waste waters in Barcelona on March 12, 2019 - UB, accessed on October 7, 2025, https://web.ub.edu/en/web/actualitat/w/sars-cov-2-detected-in-waste-waters-in-barcelona-on-march-12-2019
36. SARS-CoV-2 in Barcelona sewers – Science Integrity Digest, accessed on October 7, 2025, https://scienceintegritydigest.com/2020/06/27/sars-cov-2-in-barcelona-sewers/
37. SARS-CoV-2 in human sewage in Santa Catalina, Brazil, November 2019 - ResearchGate, accessed on October 7, 2025, https://www.researchgate.net/publication/342539912_SARS-CoV-2_in_human_sewage_in_Santa_Catalina_Brazil_November_2019
38. SARS-CoV-2 in human sewage in Santa Catalina, Brazil, November ..., accessed on October 7, 2025, https://www.medrxiv.org/content/10.1101/2020.06.26.20140731v1
39. Wastewater Surveillance for SARS-CoV-2 in Northern Italy: An Evaluation of Three Different Gene Targets - MDPI, accessed on October 7, 2025, https://www.mdpi.com/2076-2607/13/2/236
40. No evidence of SARS-CoV-2 circulation before the identification of ..., accessed on October 7, 2025, https://www.researchgate.net/publication/343001792_No_evidence_of_SARS-CoV-2_circulation_before_the_identification_of_the_first_Swiss_SARS-CoV-2_case