
SUPPLEMENTARY INFORMATION

COVseq is a cost-effective workflow for mass-scale SARS-CoV-2 genomic surveillance

Michele Simonetti, Ning Zhang, Luuk Harbers, Maria Grazia Milia, Thi Thu Huong Nguyen, Silvia Brossa, Magda Bienko, Anna Sapino, Antonino Sottile, Valeria Ghisetti & Nicola Crosetto

1. Supplementary Figures
2. Supplementary Methods
3. Supplementary Tables
4. Supplementary Notes
5. Supplementary References

1. Supplementary Figures

Supplementary Fig. 1. Theoretical SARS-CoV-2 genome coverage by COVseq. (a) Distributions of distances between consecutive MseI and NlaIII recognition sites along the SARS-CoV-2 genome. Theoretically, 97.1% of the fragments generated by cutting with both MseI and NlaIII are less than 300 base pairs (bp). (b) Theoretical COVseq coverage of the most frequent SARS-CoV-2 SNVs detected worldwide¹ and of the SNVs in the recently emerged UK lineage², using single-end (SE) sequencing with different read lengths, as shown on the top right. The y-axis indicates the number of MseI or NlaIII recognition sites that are closer to each SNV than the read length.
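The distribution in panel (a) can be approximated in silico. The following is a minimal sketch, not the authors' analysis code. It assumes the standard recognition sequences of MseI (TTAA) and NlaIII (CATG), approximates fragment length by the distance between consecutive site starts (the exact cut offsets within each site are ignored), and runs on a toy sequence rather than the real genome. All function names are hypothetical.

```python
import re

# Standard recognition sequences; both are palindromic, so scanning one
# strand is sufficient.
SITES = {"MseI": "TTAA", "NlaIII": "CATG"}


def site_positions(genome, motifs):
    """Return the sorted 0-based start positions of every motif occurrence."""
    positions = set()
    for motif in motifs:
        # The lookahead also reports overlapping occurrences.
        for match in re.finditer(f"(?={motif})", genome):
            positions.add(match.start())
    return sorted(positions)


def inter_site_distances(genome, motifs):
    """Distances between consecutive recognition sites (approximate fragment lengths)."""
    pos = site_positions(genome, motifs)
    return [b - a for a, b in zip(pos, pos[1:])]


def fraction_shorter_than(distances, limit=300):
    """Fraction of distances strictly below `limit` (0.0 for an empty list)."""
    if not distances:
        return 0.0
    return sum(d < limit for d in distances) / len(distances)


# Toy sequence for illustration only; replace with the reference genome string.
toy_genome = "GGTTAACCCATGAAAATTAAGG"
distances = inter_site_distances(toy_genome, SITES.values())
print(distances, fraction_shorter_than(distances, limit=300))
```

Applied to the SARS-CoV-2 reference sequence, `fraction_shorter_than(distances, 300)` would give an estimate comparable in spirit to the 97.1% quoted in the caption, although the exact value depends on how fragment ends are defined.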

Supplementary Fig. 2. SARS-CoV-2 genome coverage by the multiplexed PCR assay developed by the US CDC and NEBNext library preparation kit applied to one RNA sample extracted from the supernatant of a SARS-CoV-2 viral culture (see Methods). (a) Number of sequencing reads mapping at each base along the SARS-CoV-2 genome. The regions of the SARS-CoV-2 genome displaying the highest depth of coverage correspond to the regions covered by more amplicons, as shown in Fig. 1a. (b) Whole genome and S region coverage at various minimum sequencing depths, for the same NEBNext library shown in (a). (c) Inverse correlation between the cycle threshold determined by RT-PCR and the number of reads, for n = 30 samples (samples 1–30 in Supplementary Table 4) sequenced by NEBNext (see Methods). M, millions. Each dot represents a sample. Dashed red line: linear regression fit. R, Pearson’s correlation coefficient. P, t-test, two-tailed.
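The statistics reported in panel (c) can be computed along the following lines. This is a minimal sketch using only the Python standard library; the Ct values and read counts below are invented for illustration and are not data from this study. The two-tailed P value would additionally require the Student t distribution with n − 2 degrees of freedom, which is not included here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of no correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical example values (NOT from Supplementary Table 4).
ct_values = [18, 22, 26, 30, 34]
reads_millions = [5.1, 4.2, 2.9, 2.2, 0.8]

r = pearson_r(ct_values, reads_millions)
print(f"R = {r:.3f}, t = {t_statistic(r, len(ct_values)):.2f}")
```

If SciPy is available, `scipy.stats.pearsonr` returns the coefficient together with a two-sided P value in a single call.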

Supplementary Fig. 3. Validation of COVseq by a standard library preparation method (NEBNext). (a) Number of SNVs detected in 30 samples (samples 1–30 in Supplementary Table 4) sequenced by COVseq, NEBNext or both. (b) Ideogram showing the regions in the SARS-CoV-2 genome (‘dark regions’) that are more than 300 nt away from the closest MseI or NlaIII site and therefore cannot be covered by SE150. Adding an extra restriction enzyme (BfaI) is predicted to result in coverage of these ‘dark regions’ using SE150. Vertical black bars indicate BfaI recognition sites. The colors of the genes in the SARS-CoV-2 genome are the same as in Fig. 1a.
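The ‘dark regions’ of panel (b) can be derived from a list of recognition-site positions. The sketch below is an illustrative simplification rather than the method used for the figure: it treats each site as reaching `max_distance` bases in both directions and reports the bases that no site reaches. The function name, the 0-based half-open coordinates and the toy positions are assumptions.

```python
def dark_regions(site_positions, genome_length, max_distance=300):
    """Return half-open (start, end) intervals of bases that lie more than
    `max_distance` away from every recognition site."""
    dark = []
    cursor = 0  # first base not yet shown to be within reach of a site
    for pos in sorted(site_positions):
        reach_start = max(pos - max_distance, 0)
        if reach_start > cursor:
            dark.append((cursor, reach_start))
        # Bases pos - max_distance .. pos + max_distance (inclusive) are reachable.
        cursor = max(cursor, min(pos + max_distance + 1, genome_length))
    if cursor < genome_length:
        dark.append((cursor, genome_length))
    return dark


# Toy example: two sites on a 100-base sequence with a reach of 5 bases.
print(dark_regions([10, 50], genome_length=100, max_distance=5))
```

Adding the positions of a third enzyme (for example BfaI sites) to `site_positions` and re-running the function shows which dark intervals shrink or disappear.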

Supplementary Fig. 4. (a-d) Inverse correlation between the cycle threshold determined by RT-PCR and the total number of reads obtained by COVseq, in the reference (Ref) and three replicate (Rep) libraries prepared from 55 SARS-CoV-2 positive left-over RNA samples (samples 31–85 in Supplementary Table 4). M, millions. Each dot represents a sample. Dashed red line: linear regression fit. R, Pearson’s correlation coefficient. P, t-test, two-tailed.

Supplementary Fig. 5. (a-d) Percentage of sequencing reads aligned to the SARS-CoV-2 genome, human genome (Hs), other genomes (Other) or unmapped, in the reference (Ref) and three replicate (Rep) libraries prepared from 55 SARS-CoV-2 positive left-over RNA samples (samples 31–85 in Supplementary Table 4). WF, COVseq workflow (see Methods). (e) Cycle threshold (Ct) values of the 55 samples shown in (a-d). Sample IDs are the same as in Supplementary Table 4.

Supplementary Fig. 6. (a-c) Correlation between the breadth of coverage in the reference (Ref) and three replicate (Rep) libraries prepared from 55 SARS-CoV-2 positive left-over RNA samples (samples 31–85 in Supplementary Table 4). Each dot represents one sample. Dashed red lines: linear regression fit. R: Pearson’s correlation coefficient. P: t-test, two-tailed. (d-h) Venn diagrams showing the extent of overlap between the SNVs identified in samples with Ct either 30 or 35 included in the same Ref and Rep libraries shown in (a-c). (i-j) Bar plots showing the number of SNVs shared between different Rep libraries.

Supplementary Fig. 7. Proof-of-principle experiment using the COVseq workflow #3 described in the Methods and MseI and NlaIII in combination. (a) Breadth of coverage of the SARS-CoV-2 reference genome at varying sequencing depths, for two different input amounts of synthetic SARS-CoV-2 RNA (5,000 and 10,000 copies). The dashed red line represents the theoretical coverage at 1×. (b) Same as in (a), but for the S gene.
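Breadth of coverage at a given minimum depth, as plotted in panels (a) and (b), is the fraction of reference positions covered by at least that many reads. A minimal sketch follows, assuming the per-base depths are already available as a list of integers (for instance parsed from the output of a depth-reporting tool). The function names and the toy depth vector are hypothetical.

```python
def breadth_of_coverage(depths, min_depth=1):
    """Fraction of positions whose depth is at least `min_depth`."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_depth for d in depths) / len(depths)


def breadth_curve(depths, thresholds):
    """Breadth of coverage for each minimum-depth threshold."""
    return {t: breadth_of_coverage(depths, t) for t in thresholds}


# Toy per-base depths for a 5-base region (illustration only).
toy_depths = [0, 1, 5, 10, 100]
print(breadth_curve(toy_depths, thresholds=[1, 10, 1000]))
```

Restricting `depths` to the coordinates of a single gene gives the gene-level curve, as done for the S gene in panel (b).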

Supplementary Fig. 8. Locations of BfaI, MseI, NlaIII recognition sites (vertical colored bars) along the genome of (a) the H1N1 strain of Influenza type A, (b) Influenza type B and (c) Dengue virus. Reference NCBI accession numbers are indicated near each plot. Combining these enzymes with virus-specific multiplexed PCR assays (see Supplementary Table 6) would expand the applications of COVseq in genomic epidemiological surveillance.

2. Supplementary Methods

Step-by-step COVseq protocol

REAGENTS
- Absolute Ethanol (VWR, cat. no. 20816.367)
- Nuclease-Free Water (Thermo Fisher Scientific, cat. no. AM9932)
- Mineral oil (Sigma, cat. no. M5904)
- Random hexamers (50 μM) (Thermo Fisher Scientific, cat. no. N80800127)
- dNTPs (10 mM) (Thermo Fisher Scientific, cat. no. R0191)
- NEBNext Q5 Hot Start HiFi PCR Master Mix (NEB, cat. no. M0543L)
- Primer pools 1, 2, 3, 4, 5 and 6 (IDT custom @ 50 μM) (see Supplementary Table 1)
- MseI (NEB, cat. no. R0525L)
- NlaIII (NEB, cat. no. R0125L)
- CutSmart buffer (NEB, cat. no. B7204S)
- T4 DNA Ligase (Thermo Fisher Scientific, cat. no. EL0011)
- COVseq oligonucleotide adapters (see Supplementary Table 2)
- UltraPure BSA (50 mg/ml) (Thermo Fisher Scientific, cat. no. AM2616)
- ATP Solution (100 mM) (Thermo Fisher Scientific, cat. no. R0441)
- dNTPs (25 mM) (Thermo Fisher Scientific, cat. no. R1121)
- MEGAscript T7 Transcription Kit (Thermo Fisher Scientific, cat. no. AM1334)
- DNase I, RNase-free (Thermo Fisher Scientific, cat. no. AM2222)
- RA3 adapter and RTP, RP1 and RPI primers (custom-made by Integrated DNA Technologies Inc. based on the sequences in the TruSeq Small RNA Library Preparation kit, Illumina)
- RNaseOUT Recombinant Ribonuclease Inhibitor (Invitrogen, cat. no. 10777-019)
- T4 RNA ligase 2, truncated (NEB, cat. no. M0242L)
- SuperScript IV Reverse Transcriptase (Thermo Fisher Scientific, cat. no. 18090050)
- NEBNext UltraII Q5 PCR Mastermix (NEB, cat. no. M0544S)
- Agencourt RNAClean XP with Scalable throughput (Beckman Coulter, cat. no. A63987)
- Agencourt AMPure XP (Beckman Coulter, cat. no. A6388§)
- Qubit RNA BR Assay Kit (Thermo Fisher Scientific, cat. no. Q10211)
- Qubit dsDNA BR Assay Kit (Thermo Fisher Scientific, cat. no. Q32850)
- Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, cat. no. Q32851)
- Bioanalyzer High Sensitivity DNA Kit (Agilent, cat. no. 5067-4627)

CONSUMABLES
- Eppendorf DNA LoBind microcentrifuge tubes 0.5 ml (Sigma, cat. no. EP0030108035250EA)
- Eppendorf DNA LoBind microcentrifuge tubes 1.5 ml (Sigma, cat. no. EP0030108051250EA)
- Sapphire Filter tips, low retention (Greiner Bio-One, cat. no. 771265, 773265, 738265, 750265)
- microTUBE-50 AFA Fiber Screw-Cap (25) (Covaris, cat. no. 520166)
- 96-well plates (Thermo Fisher Scientific, cat. no. 4316813)
- 384-well plates (Thermo Fisher Scientific, cat. no. 4483320)
- Qubit™ Assay Tubes (Thermo Fisher Scientific, cat. no. Q32856)
- Bioanalyzer High-sensitivity DNA kit (Chips) (Agilent, cat. no. 5067-4626)

EQUIPMENT
- Incubator (for example, Binder incubator, Model KB 53 or Boekel Scientific InSlide Out, cat. no. 05-450-50)
- Tabletop centrifuge (for example, Eppendorf Microcentrifuge 5424)
- I-DOT One (Dispendix GmbH, Stuttgart, Germany)
- Thermoshaker (for example, Eppendorf Thermomixer Compact)
- PCR thermocycler (for example, Biometra TRIO)
- Sonication device (for example, ME220 Focused-ultrasonicator, Covaris)
- DynaMag-2 Magnet (Thermo Fisher Scientific, cat. no. 12321D)
- Qubit 2.0 Fluorometer (Thermo Fisher Scientific, cat. no. Q32866)
- Bioanalyzer 2100 (Agilent, cat. no. G2943CA)

PROCEDURE

DAY 1

First strand synthesis
1. Mix the following components:
   RNA                                   5 μl
   50 μM random hexamers                 1 μl
   10 mM dNTPs                           1 μl
   Nuclease-Free Water                   6 μl
2. Incubate for 5 min @ 65 °C
3. Place the tube immediately on ice for 5 min
4. Add the following components:
   5x SSIV buffer                        4 μl
   0.1 M DTT                             1 μl
   RNaseOUT                              1 μl
   SSIV Reverse Transcriptase enzyme     1 μl
5. Perform the following steps in a PCR thermocycler with the lid set @ 85 °C:
   1. 25 °C    10 min
   2. 50 °C    10 min
   3. 85 °C    10 min
   4. 4 °C     Hold
6. Add 1 μl RNase H to the tube and incubate for 20 min at 37 °C

Multiplex PCR
Note: Prepare primers as 50 μM primer stocks. Add an equal amount of each 50 μM primer stock to six different Eppendorf tubes labeled as pool 1, 2, 3, 4, 5 and 6. Prepare a 10 μM working concentration by diluting each pool 1:5 with Nuclease-Free Water.
7. Mix the following components:
   NEBNext Q5 Hot Start HiFi PCR Master Mix    15 μl
   Nuclease-Free Water                         9.2 μl
   Primer pool 1, 2, 3, 4, 5 or 6 (10 μM)      1.8 μl
   4x SYBR Green                               1 μl
8. Add 3 μl of cDNA to each tube
9. Perform the following steps in a PCR thermocycler with the lid set @ 105 °C:
   1. 98 °C    30 sec
   2. 98 °C    15 sec
   3. 65 °C    5 min
   GOTO step 2, 40 times
   4. 4 °C     Hold

DNA purification
Note: This is an optional step. Unpurified material can also be used in the following steps.

10. Pool 20 μl from each of the six amplicon pools into a 1.5 ml LoBind tube
11. Add 1x vol/vol ratio of AMPure XP beads pre-warmed at room temperature
12. Mix thoroughly and incubate for 10 min at room temperature
13. Place the sample on a magnetic stand
14. Incubate for at least 5 min until the liquid appears clear
15. Remove and discard the supernatant
16. Wash the beads twice with 200 μl of freshly prepared 80% ethanol
17. Air-dry the beads at room temperature
Note: do not dry the beads for more than 5–8 min, since this may result in low DNA yield
18. Remove the sample from the magnetic stand
19. Resuspend the beads in 80 μl of nuclease-free water
20. Incubate for 2 min at room temperature
21. Place the sample back on the magnetic stand
22. Incubate for at least 5 min until the liquid appears clear
23. Transfer the supernatant to a new 1.5 ml DNA LoBind tube
24. Check the library concentration using Qubit dsDNA BR kit
Note: Samples can be stored for a long time @ -20 °C

DAY 2

Note: To process multiple samples in parallel, we performed all reactions until IVT in 384-well plates. We used the I-DOT One nanodispensing device (Dispendix GmbH) to reduce the volume of each reagent. For this step, one can use unpurified material coming from the multiplex PCR after pooling an equal volume from each of the six pools. Other dispensing systems may also be used; however, volumes might have to be adjusted depending on the technical specifications of each instrument.

DNA digestion
25. Manually dispense 5 μl of mineral oil per well in the targeted region of 384-well plates
26. Dispense 50 nL of purified or 100 nL of non-purified PCR amplicons
27. Dispense Nuclease-Free Water to a total volume of 350 nL
Note: From now on, after each dispensing step, we shake the plate in a ThermoMixer at 1,000 rpm for 1 min and centrifuge at 3,220 g for 5 min before each incubation
28. Mix the following components:
   NlaIII enzyme           50 nL
   MseI enzyme             50 nL
   10x CutSmart Buffer     50 nL
29. Dispense 150 nL per well
30. Perform the following incubation steps:
   1. 37 °C    1 h
   2. 65 °C    20 min
   3. 4 °C     Hold

Ligation of COVseq adapters
31. Dispense 150 nL of 33 nM COVseq adapter for NlaIII per well
32. Dispense 150 nL of 33 nM COVseq adapter for MseI per well
33. Dispense 700 nL of ligation mix per well:
   Nuclease-free water     250 nL
   10x T4 ligase buffer    150 nL
   ATP 10 mM               120 nL
   BSA 50 mg/ml            30 nL
   T4 standard ligase      150 nL
34. Perform incubation at 22 °C for 1 h followed by inactivation at 70 °C for 5 min
35. Manually dispense 5 μl of Nuclease-Free Water per well
36. Pool the content of multiple wells manually in a 1.5 ml or 5 ml Eppendorf tube
37. Spin down the tube and carefully remove the upper phase containing mineral oil
Note: The pooling step can also be performed by centrifuging the plate upside down into a collection plate placed at the bottom, at 800 rpm for 1 min.

DNA cleanup
38. Add 1.2x vol/vol ratio of AMPure XP beads pre-warmed at room temperature
39. Mix thoroughly and incubate for 10 min at room temperature
40. Place the sample on a magnetic stand
41. Incubate for at least 5 min until the liquid appears clear
42. Remove and discard the supernatant
43. Wash the beads twice with freshly prepared 80% ethanol (the ethanol should be enough to cover the beads)
44. Air-dry the beads at room temperature
Note: do not dry the beads for more than 5–8 min, since this may result in low DNA yield

45. Remove the sample from the magnetic stand
46. Resuspend the beads in 10 μl of nuclease-free water
47. Incubate for 2 min at room temperature
48. Place the sample back on the magnetic stand
49. Incubate for at least 5 min until the liquid appears clear
50. Transfer the supernatant to a new 1.5 ml DNA LoBind tube
51. Check the library concentration using Qubit dsDNA HS kit
Note: Samples can be stored for a long time @ -20 °C

In vitro transcription
52. Start with 8 μl from the previous step
53. Add the following reagents on ice:
   rATP rUTP rGTP rCTP*                              8 μl
   10x T7 polymerase buffer                          2 μl
   T7 polymerase                                     1.5 μl
   RNaseOUT™ Recombinant Ribonuclease Inhibitor      0.5 μl
   *Prepared from separate rNTP solutions provided with the MEGAscript T7 Transcription Kit
54. Incubate for 14 hours at 37 °C in a PCR thermocycler with the lid set @ 70 °C
Note: IVT can also be performed at 37 °C for 2 hours to save time.

DAY 3

RNA cleanup
55. Add 1 μl of DNase I (RNase-free) to the IVT product
56. Incubate for 15 min @ 37 °C
57. Bring up the volume to 50 μl by adding 29 μl Nuclease-Free Water, then mix with 90 μl (1.8x vol/vol) of RNAClean XP beads pre-warmed at room temperature
58. Mix thoroughly and incubate for 10 min at room temperature
59. Place the sample on a magnetic stand
60. Incubate for at least 5 min until the liquid appears clear
61. Remove and discard the supernatant
62. Wash the beads twice with 200 μl of freshly prepared 70% ethanol
63. Air-dry the beads at room temperature
Note: do not dry the beads for more than 5–8 min, since this may result in low DNA yield
64. Remove the sample from the magnetic stand
65. Resuspend the beads in 9 μl of nuclease-free water

66. Incubate for 2 min at room temperature
67. Place the sample back on the magnetic stand
68. Incubate for at least 5 min until the liquid appears clear
69. Transfer 8.8 μl of supernatant to a new 0.5 ml DNA LoBind tube
70. Check the library concentration with 1 μl using Qubit dsDNA BR kit

RA3 adapter ligation
71. Add 1 μl of 10 μM RA3 adapter to the 7.8 μl obtained after RNA cleanup
72. Incubate for 2 min @ 70 °C in a PCR thermocycler, then immediately place the sample on ice
73. Add 3.2 μl of the following mix:
   RNA ligase buffer                                 1.2 μl
   RNaseOUT™ Recombinant Ribonuclease Inhibitor      1 μl
   T4 RNA ligase truncated                           1 μl
74. Incubate for 2 hours @ 25 °C in a PCR thermocycler with the lid set @ 30 °C

Reverse transcription
75. Add 2 μl per sample of RTP primer
76. In a PCR thermocycler, incubate for 2 min @ 70 °C
77. Quickly transfer the sample to ice
78. Add 11 μl of the following mix:
   5x SSIV buffer                                    5 μl
   12.5 mM dNTPs                                     1 μl
   0.1 M DTT                                         2 μl
   RNaseOUT™ Recombinant Ribonuclease Inhibitor      1 μl
   SuperScript IV reverse transcriptase              2 μl
79. Incubate for 20 min @ 50 °C followed by inactivation for 10 min @ 80 °C in a PCR thermocycler with the lid set @ 80 °C

Library indexing and amplification
80. Add 16 μl per sample of the desired indexed Illumina primer
81. Add 359 μl of the following mix:
   Nuclease-free water                   143 μl
   NEBNext™ Ultra II PCR Master Mix      200 μl
   RP1 primer                            16 μl
82. Divide the final mix into 8 PCR tubes, each containing 50 μl
83. In a PCR thermocycler, perform the following cycles:
   1. 98 °C    30 sec
   2. 98 °C    10 sec
   3. 60 °C    30 sec
   4. 65 °C    45 sec
   GOTO step 2, 10 times
   5. 65 °C    5 min
   6. 4 °C     Hold
Note: 10 PCR cycles are used for an IVT input of 200 ng. Adjust the number of PCR cycles according to the input to the IVT step.

Final library cleanup
84. Pool the 8 tubes for each sample and add 0.8x vol/vol ratio of AMPure XP beads pre-warmed at room temperature
85. Mix thoroughly and incubate for 10 min at room temperature
86. Place the sample on a magnetic stand
87. Incubate for at least 5 min until the liquid appears clear
88. Remove and discard the supernatant
89. Wash the beads twice with 1 ml of freshly prepared 80% ethanol
90. Air-dry the beads at room temperature
Note: do not dry the beads for more than 5–8 min, since this may result in low DNA yield
91. Remove the sample from the magnetic stand
92. Resuspend the beads in 50 μl of nuclease-free water
93. Incubate for 2 min at room temperature
94. Place the sample back on the magnetic stand
95. Incubate for at least 5 min until the liquid appears clear
96. Transfer the supernatant to a new 1.5 ml DNA LoBind tube
97. Check the library concentration using Qubit dsDNA HS kit
98. Check the fragment distribution on a Bioanalyzer 2100 using a DNA HS chip
Note: Libraries can be stored for a long time @ -20 °C
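Several steps of the procedure depend on component volumes adding up to a stated total (for example the 700 nL ligation mix, or the indexing PCR that is split into 8 tubes of 50 μl). The following is a small bookkeeping sketch, offered as an illustration rather than as part of the protocol. The volumes are copied from the steps above; the helper names and the `overage` parameter (extra volume to compensate for dispensing losses) are assumptions.

```python
# Per-well ligation mix in nanoliters (step 33).
LIGATION_MIX_NL = {
    "Nuclease-free water": 250,
    "10x T4 ligase buffer": 150,
    "ATP 10 mM": 120,
    "BSA 50 mg/ml": 30,
    "T4 standard ligase": 150,
}

# Indexing PCR mix in microliters (step 81).
INDEXING_PCR_MIX_UL = {
    "Nuclease-free water": 143,
    "NEBNext Ultra II PCR Master Mix": 200,
    "RP1 primer": 16,
}


def total_volume(mix):
    """Sum of all component volumes in a mix."""
    return sum(mix.values())


def scale_mix(mix, n_reactions, overage=0.1):
    """Scale per-reaction volumes to n reactions plus a fractional overage.

    The overage is a hypothetical convenience, not a value from the protocol.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 3) for name, volume in mix.items()}


# Volume carried into the indexing PCR: RNA (7.8) + RA3 adapter (1) +
# ligation mix (3.2) + RTP primer (2) + RT mix (11), all in microliters.
rt_reaction_ul = 7.8 + 1 + 3.2 + 2 + 11
# Add the indexed primer (16 ul, step 80) and the PCR mix (step 81).
final_pcr_ul = rt_reaction_ul + 16 + total_volume(INDEXING_PCR_MIX_UL)

print(total_volume(LIGATION_MIX_NL))   # ligation mix per well, nL
print(round(final_pcr_ul / 8, 1))      # volume per PCR tube, ul
print(scale_mix(LIGATION_MIX_NL, n_reactions=384))
```

Running such a check before programming a dispenser is one way to catch transcription errors in a plate layout.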

3. Supplementary Tables

Supplementary Table 1. List of primers used in the CDC SARS-CoV-2 multiplexed PCR assay. Due to its large size, the table is provided as a separate Excel file.

Supplementary Table 2. Summary of sequencing results. Due to its large size, the table is provided as a separate Excel file.

Supplementary Table 3. List of oligonucleotides that can be used to prepare COVseq adapters. Due to its large size, the table is provided as a separate Excel file.

Supplementary Table 4. List of samples and corresponding Ct values. AS: samples collected at the ‘Amedeo di Savoia’ Hospital in Turin, Italy during the Phase I of the 2020 pandemic (diagnostic samples). CCI: samples collected at the Candiolo Cancer Institute in Turin, Italy during the Phase II of the pandemic (screening samples).
Table columns: Sample ID; Institution; Sampling; Target region and corresponding Ct value.

Supplementary Table 5. List of reagents and relative costs for sequencing SARS-CoV-2 samples using COVseq vs. three commercially available library preparation kits. Three separate lists are provided for COVseq, based on the three workflows discussed in the Supplementary Note 1. Due to its large size, the table is provided as a separate Excel file.

Supplementary Table 6. Available multiplexed PCR assays for WGS of other viruses in addition to SARS-CoV-2.

Assay: CDC Influenza SARS-CoV-2 (Flu SC2) Multiplex Assay
Target: SARS-CoV-2, Influenza A and Influenza B viruses
Weblink: https://www.fda.gov/media/139743/download

Assay: CDC Dengue
Weblink: lthcare-multiplex l#anchor 1556657754547

Supplementary Table 7. List of reference sequences used for alignment and phylogenetic analyses.

Genome                     Reference
SARS-CoV-2                 NC_045512.2
Human                      GRCh38
H1N1, Influenza type A     NC_026431.1 - NC_026438.1
Influenza type B           NC_002204.1 - NC_002211.1
Dengue                     NC_001474.2
Adapters                   Default FastQ-Screen reference
Arabidopsis                Default FastQ-Screen reference
Drosophila                 Default FastQ-Screen reference
E. coli                    Default FastQ-Screen reference
Lambda                     Default FastQ-Screen reference
Mitochondria               Default FastQ-Screen reference
Mouse                      Default FastQ-Screen reference
PhiX                       Default FastQ-Screen reference
Rat                        Default FastQ-Screen reference
rRNA                       Default FastQ-Screen reference
Vectors                    Default FastQ-Screen reference
Worm                       Default FastQ-Screen reference
Yeast                      Default FastQ-Screen reference

4. Supplementary Notes

Cost analysis

To demonstrate the cost effectiveness of COVseq, we examine three different COVseq workflows and compare them with three widely used commercial kits for preparing libraries directly from SARS-CoV-2 RNA (TruSeq Stranded Total RNA Library Prep kit, Illumina cat. no. 20020596) or from purified amplicons (NEBNext Ultra II FS DNA Library Prep Kit, NEB cat. no. E7805S; Nextera XT DNA Library Preparation Kit, Illumina cat. no. FC-131-1024). For commercial kits, we assume that the kit with the highest available number of library indexes is purchased (96 for TruSeq and NEBNext, 384 for Nextera), so that many samples can be sequenced together (see considerations on sequencing below). A detailed list of reagents, volumes per reaction, number of reactions and current prices (as of Dec 2020) is available in Supplementary Table 5. For simplicity, we omit the cost of plasticware and other consumables (pipette tips, gloves, etc.) from our analysis. To simulate mass-scale SARS-CoV-2 sequencing performed by a centralized laboratory or a public health agency, we compute how the cumulative reagent cost grows as the number of samples processed increases up to 100,000. The simulation does not include sequencing costs, which are discussed separately below. The simulation is run through a custom script written in MATLAB, which we make available upon request.

1. Comparison between different COVseq workflows and formats

We consider three different scenarios, depending on which steps in the COVseq protocol are performed using standard reaction volumes (microliter range) or nanoliter volumes. For the latter, we assume the use of the I-DOT One contactless nanodispensing device (Dispendix GmbH), which we previously described for high-throughput CUTseq¹. However, any other contactless device with similar characteristics should be equally effective.
- Workflow #1: (i) reverse transcription (RT) and multiplexed PCR are done using standard volumes; (ii) purification of the PCR products; (iii) CUTseq using I-DOT.
- Workflow #2: (i) RT and multiplexed PCR using standard volumes; (ii) no purification of the PCR products; (iii) CUTseq using I-DOT.
- Workflow #3: all reactions until IVT are done in nanoliter volumes using I-DOT. Because, for logistic reasons, we could not transfer our I-DOT machine into a biosafety level 2 (BSL-2) laboratory, we could only test this scenario using synthetic SARS-CoV-2 RNA, which can be handled in a standard BSL-1 lab. The results of this test are shown in Supplementary Fig. 7a and b.

In all aforementioned workflows, we assume the use of a combination of two restriction enzymes, MseI and NlaIII, to digest the SARS-CoV-2 genome. For each workflow, we consider two scenarios: (i) one case in which 96 samples are pooled into the same library; and (ii) one case in which 384 samples are pooled together. In the former, 96 MseI and 96 NlaIII adapters, each containing a different sample barcode sequence, must be purchased upfront, whereas in the latter case a total of 384 × 2 = 768 adapters are needed. As shown in Fig. 2c, the cumulative cost curves are very similar for workflow #1 and #2, with the latter workflow in combination with the use of 384 adapters being more cost-effective (16.54 per sample to prepare libraries from 100,000 samples with workflow #2 and 384 adapters vs. 20.28 per sample using workflow #1 and 96 adapters). The highest cost effectiveness is achieved with workflow #3 and 384 adapters (2.76 and 1.90 per sample for 10,000 and 100,000 samples, respectively). Thus, if a nanodispensing device can be placed into a BSL-2 environment, COVseq workflow #3 provides the most cost-effective solution for mass-scale sequencing of SARS-CoV-2 genomes. In principle, a larger repertoire of adapters with different sample barcodes could be designed in order to pool more samples into the same library. Such high-level multiplexing is routinely performed in single-cell RNA sequencing assays, such as DROP-seq². Therefore, we do not see any obstacle to adapting the barcodes used in DROP-seq or other highly multiplexed sequencing assays to expand the repertoire of COVseq adapters.

2. Comparison between COVseq and available commercial solutions

As shown in Fig. 2d, the cumulative cost curves for preparing sequencing libraries from SARS-CoV-2 samples using existing commercial kits are all above the cumulative COVseq curves, independently of the workflow used. Illumina’s TruSeq kit represents the most expensive solution, with an estimated cost per sample three orders of magnitude higher than the corresponding COVseq cost, assuming workflow #3, 384 adapters and 100,000 samples processed (3,103 vs. 1.90 per sample). The main reason for this dramatic cost discrepancy in favor of COVseq (independently of the workflow chosen) is that commercial kits are specifically designed to make a single library from each individual sample, whereas in COVseq multiple samples are pooled together into the same library, which drastically reduces the reagent costs required for library preparation. In principle, ‘nano’ versions of commercial library preparation kits could be implemented on I-DOT or other similar nanodispensing devices, in order to reduce the volume of each reagent, and thus the cost per sample. However, even after drastically reducing the volume of reagents used in commercial kits, the cost per sample would remain considerably higher compared to COVseq, because only a limited number of libraries (and therefore samples) can be sequenced together in the same run, as discussed below.

3. Considerations on the sequencing platform and number of samples sequenced together

In this work, we have sequenced all our samples on the NextSeq 500 platform from Illumina. Therefore, for the purpose of this cost analysis, we assume the use of this platform to compare the cost of sequencing COVseq libraries vs. libraries obtained using commercial kits, as discussed above. In general, the number of samples that can be sequenced together depends: (i) on the desired depth of coverage per sample, which in turn influences the subsequent ability to confidently call SNVs and other variants; and (ii) on the number of libraries that can be sequenced together. With a commercial kit providing 96 library indexes, at most 96 libraries can be sequenced together in the same sequencing run. Using the NextSeq 500/550 High Output Kit v2.5 (150 cycles) (Illumina, cat. no. 20024907), which typically yields around 400 million reads, this would result in approx. 4 million reads per sample (assuming all the libraries to be well balanced) with a sequencing cost of 40 per sample. Based on our results, this figure is one order of magnitude larger than the minimum number of reads (400,000) required to cover the SARS-CoV-2 genome at an average depth (1500×) sufficient to confidently call SNVs. Notably, when preparing libraries from individual samples, the concentration of each library must be accurately quantified before pooling multiple libraries into the same sequencing run, which further increases the cost per sample.
In contrast, because multiple samples are pooled into the same library, the number of samples that can be effectively sequenced together drastically increases in COVseq. For example, using 384 COVseq adapters, up to 1,152 samples (3 libraries) can be sequenced together in the same NextSeq 500 run, yielding 347,000 reads per sample at a cost of 4 per sample. This cost could be further lowered by combining higher multiplexing (using more adapters as discussed above) with higher throughput platforms, such as Illumina’s NovaSeq 6000. In conclusion, based on our cost estimates, using COVseq workflow #3 with 384 adapters and 1,152 samples sequenced per NextSeq 500 SE150 run would allow sequencing 100,000 samples at a cost of less than 6 per sample. To our knowledge, this makes COVseq the most cost-efficient solution for mass-scale genomic surveillance of SARS-CoV-2 currently available.
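The arithmetic behind these sequencing estimates can be retraced with a few lines of code. The sketch below is only an illustration of the figures quoted in this note, not the authors' MATLAB simulation: it assumes perfectly balanced libraries, takes the run yield (400 million reads) and the per-sample costs (1.90 for library preparation with workflow #3 and 384 adapters, 4 for sequencing) directly from the text, and uses hypothetical function names. Costs are in the same unit as in the text.

```python
RUN_READS = 400_000_000      # typical yield of one NextSeq 500 High Output run (from the text)
MIN_READS_PER_SAMPLE = 400_000   # minimum reads per sample quoted in the text


def reads_per_sample(run_reads, n_samples):
    """Average reads per sample, assuming perfectly balanced libraries."""
    return run_reads / n_samples


def total_cost_per_sample(library_cost, sequencing_cost):
    """Library preparation plus sequencing cost per sample."""
    return library_cost + sequencing_cost


# One sample per library, 96 library indexes per run.
per_sample_96 = reads_per_sample(RUN_READS, 96)
# Three COVseq libraries of 384 samples each in one run.
per_sample_covseq = reads_per_sample(RUN_READS, 3 * 384)

print(round(per_sample_96))                            # about 4.2 million reads
print(round(per_sample_96 / MIN_READS_PER_SAMPLE, 1))  # roughly 10-fold above the minimum
print(round(per_sample_covseq))                        # about 347,000 reads
print(round(total_cost_per_sample(1.90, 4.0), 2))      # below 6 per sample
```

With 1,152 samples per run, the average yield falls just below the 400,000-read minimum quoted above, which indicates that this level of multiplexing is close to the limit for the run yield assumed here.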

5. Supplementary References

1. Zhang, X. et al. CUTseq is a versatile method for preparing multiplexed DNA sequencing libraries from low-input samples. Nat. Commun. 10, 4732 (2019).
2. Macosko, E. Z. et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214 (2015).
