
AspSeq assembly progress

Raw data

Illumina Paired End:

This was our initial dataset. It was sequenced several years ago on a HiSeq using older chemistry, and the data show some GC bias as a result.

Library Name  Coverage  MeanRL  MedianRL  Count      MeanQ  MedianQ  MeanGC  MedianGC  Insert Size
PE150         11.44     101     101       54275444   31.77  36       0.4079  0.4046    150
PE150.1       11.44     101     101       54275444   32.58  36.99    0.4073  0.4043    150
PE150-1       26.41     101     101       125234671  32.4   36.58    0.3859  0.3861    150
PE150-1.1     26.41     101     101       125234671  30.58  35.96    0.3864  0.3861    150
PE300         21.42     101     101       101587753  28.92  35       0.3769  0.3762    300
PE300.1       21.42     101     101       101587753  30.78  36       0.3769  0.3762    300
PE650         13.4      101     101       63533295   28.3   35.01    0.3791  0.382     650
PE650.1       13.4      101     101       63533295   30.84  36.02    0.3783  0.3792    650
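
The Coverage column appears consistent with the usual estimate of read count times mean read length divided by genome size. A minimal Python sketch, assuming MeanRL is the mean read length in bp and taking the 479Mbp flow-cytometry genome size quoted under Assembly statistics as the denominator (the function and constant names are illustrative and not part of the original pipeline):

```python
# Assumption: genome size of 479 Mbp, the flow-cytometry estimate
# quoted in the "Assembly statistics" section of this report.
GENOME_SIZE_BP = 479_000_000


def coverage(read_count: int, mean_read_length: float,
             genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Expected fold coverage: total sequenced bases / genome size."""
    return read_count * mean_read_length / genome_size_bp


# (Count, MeanRL) copied from the PE150 and PE300 rows of the table.
libraries = {
    "PE150": (54275444, 101),
    "PE300": (101587753, 101),
}

for name, (count, mean_rl) in libraries.items():
    print(f"{name}: {coverage(count, mean_rl):.2f}X")
# PE150: 11.44X
# PE300: 21.42X
```

Both values match the Coverage column, which suggests each row reports one read file of a pair rather than the pair combined.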

454 reads

For the initial project, we also generated about 10X coverage of 454 data.

Library Name    Coverage   MeanRL  MedianRL  Count   MeanQ  MedianQ  MeanGC  MedianGC  Insert Size
asp201Run1se_1  0.2848     287.5   288.1     474412  31.08  34.85    0.3607  0.3641    NA
asp201Run1se_2  0.2932     303.3   306.4     463040  31.44  35.24    0.3617  0.3651    NA
asp201Run2pe_1  0.2496     292.1   313.5     409339  28.94  30.8     0.3786  0.3695    3000
asp201Run2pe_2  0.2557     298.1   321       410843  29.09  31       0.3795  0.3699    3000
asp201Run3se_1  0.000635   130.8   77.35     2326    23.8   23.93    0.4796  0.4809    NA
asp201Run3se_2  0.0001951  134.8   81.94     693     26.01  25.24    0.4788  0.4833    NA
asp201Run4se_1  0.4005     346.6   404.7     553438  31.07  34.37    0.4061  0.3855    NA
asp201Run4se_2  0.3629     337     390.4     515844  30.53  33.15    0.4113  0.3894    NA
asp201Run5se_1  0.5258     342     372.6     736322  29.61  32.07    0.3464  0.3494    NA
asp201Run5se_2  0.5788     342.5   376.1     809422  30.02  32.65    0.3504  0.3529    NA
asp201Run6se_1  0.3801     355.2   390.7     512577  30.39  33.66    0.3441  0.3474    NA
asp201Run6se_2  0.4044     364.2   402.2     531805  30.64  34.06    0.3444  0.3473    NA
asp201Run7se_1  0.3812     346     385.1     527682  29.98  32.57    0.3487  0.3521    NA
asp201Run7se_2  0.3919     348.8   388       538231  29.9   32.4     0.3494  0.3526    NA
asp201Run8se_1  0.3791     307.3   329.5     590877  28.68  30.66    0.349   0.3519    NA
asp201Run8se_2  0.44       306.5   328.9     687637  28.53  30.35    0.3482  0.3512    NA
GWDW1HR01       0.7442     533.4   583.6     668353  27.6   28.59    0.3646  0.3641    NA
GWDW1HR02       0.7451     541.6   595.3     658997  27.71  28.77    0.3641  0.3639    NA
GWLD3LU01       0.7276     524.9   568.8     663912  28.57  30.15    0.3643  0.3644    NA
GWLD3LU02       0.6257     483.7   521       619620  27.83  28.78    0.3667  0.3663    NA
GWLD5AX01       0.365      447.2   489.3     390891  27.44  28.06    0.3703  0.3699    NA
GWLD5AX02       0.5504     498.6   541.1     528744  28.3   29.59    0.3676  0.3674    NA
GWLFG7T01       0.7119     557.3   639.7     611862  28.65  30.44    0.3634  0.3644    NA
GWLFG7T02       0.4897     511.9   582       458205  28.66  30.29    0.3662  0.367     NA

Illumina experimental runs

As part of our collaboration, the Swedish central sequencing facility tested new protocols on its HiSeq and MiSeq machines for sequencing long fragments with overlapping reads. The fragments here are 450bp in total and were sequenced with overlapping 300bp paired reads, which is why the "insert" size in the table is negative. The HiSeq machine was run in Rapid Mode.

Library Name  Coverage  MeanRL  MedianRL  Count      MeanQ  MedianQ  MeanGC  MedianGC  "Insert" size
MiSeq-300     18.06     301     301       28740790   33.24  37       0.3499  0.3505    -150
MiSeq-300.1   18.06     301     301       28740790   28.6   36.16    0.3553  0.3525    -150
HiSeq-300     113.1     301     301       1.8e+08    34.05  38       0.3464  0.3488    -150
HiSeq-300.1   113.1     301     301       1.8e+08    28.75  38       0.3491  0.3484    -150
HiSeq-300.2   113.6     301     301       180772624  34.19  38       0.3461  0.3488    -150
HiSeq-300.3   113.6     301     301       180772624  29.01  38       0.3484  0.3455    -150
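
The negative "insert" size can be read as the inner distance between mates, that is, the fragment length minus both read lengths. The sketch below works through that arithmetic under two assumptions that are not stated in the table itself: that the column reports inner distance, and that the nominal read length is 300bp (the table lists 301). The function names are illustrative.

```python
def inner_distance(fragment_length: int, read_length: int) -> int:
    """Gap between the two mates; negative when the mates overlap."""
    return fragment_length - 2 * read_length


def mate_overlap(fragment_length: int, read_length: int) -> int:
    """Number of fragment bases covered by both mates."""
    overlap = 2 * read_length - fragment_length
    # Overlap cannot be negative, nor longer than the fragment itself.
    return max(0, min(fragment_length, overlap))


# 450 bp fragments with nominal 2 x 300 bp reads (assumed values).
print(inner_distance(450, 300))  # -150
print(mate_overlap(450, 300))    # 150
```

Under these assumptions each read pair overlaps by about 150bp in the middle of the fragment, so the two mates can be merged into a single long sequence.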

Jumping reads

To scaffold the initial assemblies, we generated some Mate Pair libraries. With the exception of the 10Kb library, these suffer from high paired-end (PE) contamination and mediocre overall quality:

Library Name  Coverage  MeanRL  MedianRL  Count      MeanQ  MedianQ  MeanGC  MedianGC  Insert Size
3KbMP         22.15     101     101       1.05e+08   32.66  36.74    0.3632  0.3586    3000
3KbMP.1       22.15     101     101       1.05e+08   32.48  37       0.3635  0.3632    3000
10KbMP        2.811     49      49        27481587   36.76  38.96    0.3653  0.3673    10000
10KbMP.1      2.811     49      49        27481587   32.89  37.02    0.3761  0.3784    10000
5KbMP         2.959     101     101       14031890   34.45  36.4     0.3643  0.3564    5000
5KbMP.1       2.959     101     101       14031890   33.64  36.22    0.3641  0.3564    5000
3KbMP.2       8.774     101     101       41611057   32.31  35.28    0.367   0.3663    3000
3KbMP.3       8.774     101     101       41611057   34.9   38.04    0.3664  0.3663    3000
5KbMP.2       12.01     81.87   93.95     70277953   37.6   39       0.3485  0.3536    5000
5KbMP.3       12.02     81.89   93.96     70277953   37.34  38.96    0.3484  0.3532    5000

Fosmid pools

Recently, we ran a pilot to sequence fosmid pools and fosmid ends. It included ~8X PacBio coverage (filtered subreads) of 5 pools of 1000 fosmids each, as well as fosmid end sequencing, which yields jumping libraries with 40Kb insert sizes. Only about 5-10% of the initial fosmid end data maps with the expected insert size.

Library Name  Coverage  MeanRL  MedianRL  Count      MeanQ  MedianQ  MeanGC  MedianGC  Insert Size
pb_162-1      1.568     5729    5861      131109     43.2   44.59    0.3922  0.3769    NA
pb_162-2      1.726     7931    7964      104220     43.09  44       0.3991  0.3832    NA
pb_162-3      1.668     7749    7742      103112     43.07  44       0.3977  0.3823    NA
pb_162-4      1.64      7518    7466      104519     43.04  44       0.3994  0.3826    NA
pb_162-5      1.302     6992    6707      89210      42.98  44       0.3983  0.3826    NA
FE1           4.918     151     151       15602157   35.96  38       0.4081  0.4159    40000
FE1.1         4.918     151     151       15602157   35.18  38       0.4082  0.4106    40000
FE2           4.968     151     151       15758362   35.95  38       0.4082  0.4106    40000
FE2.1         4.968     151     151       15758362   35.29  38       0.408   0.4106    40000

Genomic PacBio

In addition to the previous PacBio data, we have generated 60X of genomic PacBio data (filtered subreads).

Library Name  Coverage  MeanRL  MedianRL  Count      MeanQ  MedianQ  MeanGC  MedianGC
pb_158        60.28     8062    7771      3581635    42.6   43.29    0.3648  0.3539

Assembly statistics

The assembly that is currently in use (Potra v1.1) is a hybrid assembly utilizing the initial Illumina PE data, the 454 data, and the Illumina Mate Pair data. Two newer generations of the assembly are also listed here: the first is an assembly of the high-coverage overlapping Illumina data using DISCOVAR de novo; the second is a PacBio-only FALCON assembly. The expected genome size is 479Mbp, as estimated by flow cytometry.

Assembly    # Scaffolds  N50               Total length
Potra v1.1  204318       44Kbp (Scaffold)  387Mbp
DISCOVAR    44375        31Kbp (Contig)    491Mbp
FALCON      4845         484Kbp (Contig)   477Mbp
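
For reference, N50 is the length L such that sequences of length L or longer together contain at least half of the assembled bases. The sketch below is a generic illustration on made-up toy lengths, not the tool used to produce the table. It also relates the three reported total lengths to the 479Mbp expected genome size.

```python
EXPECTED_GENOME_MBP = 479  # flow-cytometry estimate quoted in the text


def n50(lengths):
    """Largest length L such that sequences >= L hold half the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Toy contig lengths (hypothetical, for illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70

# Total lengths in Mbp, copied from the assembly table.
reported_total_mbp = {"Potra v1.1": 387, "DISCOVAR": 491, "FALCON": 477}
for name, total in reported_total_mbp.items():
    share = 100 * total / EXPECTED_GENOME_MBP
    print(f"{name}: {share:.1f}% of expected size")
# Potra v1.1: 80.8% of expected size
# DISCOVAR: 102.5% of expected size
# FALCON: 99.6% of expected size
```

On these figures the FALCON assembly is closest to the expected genome size, while Potra v1.1 covers roughly four fifths of it.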
© 2021 PlantGenIE.org. All our tools are under MIT License.

