42Genetics Population is a unique tool implementing the well-known concept of using intrinsic population observations to improve variant calling. This in-context calling, or Consensus Based Call Enhancement (CBCE), improves both sensitivity and precision. The implementation scales linearly from a few to many samples and from one to many nodes.
The variants for a population or cohort are stored in a GVM (42Genetics Variant Map). This is a repository that can be used to manage the samples and to search for patterns in the genetic profile. Using the GVM allows you to explore the data freely. You can extract groups of samples from a large GVM into a separate GVM to further enhance the quality of the calls, for example in a phenotype-related cohort.
The speed and ease of use of 42Genetics Population free up time to focus on the meaning of the data instead of on handling the data. The storage system is designed and tested to run with consistent file systems such as Amazon S3.
The signature of a GVM can be captured in a profile. This profile contains condensed information about the calls and their occurrences in a GVM. Such a profile can be used to apply CBCE during germline or somatic calling, as if the samples were called as part of the GVM cohort the profile was derived from.
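The idea of a profile can be illustrated with a small Python sketch. It is not the 42Genetics profile format or the actual CBCE algorithm, neither of which is specified here. The function names, the variant key format, the quality thresholds and the "rescue" rule are all hypothetical, chosen only to show how condensed occurrence information from a cohort might inform calls on a new sample.

```python
from collections import Counter


def build_profile(cohort_calls):
    """Condense per-sample calls into: variant -> fraction of samples carrying it.

    Hypothetical stand-in for a profile; not the real 42Genetics format.
    """
    n_samples = len(cohort_calls)
    counts = Counter(v for variants in cohort_calls.values() for v in set(variants))
    return {variant: count / n_samples for variant, count in counts.items()}


def call_with_profile(candidates, profile,
                      min_quality=30.0, rescue_quality=20.0, min_frequency=0.05):
    """Accept confident candidates, plus borderline ones the profile supports.

    The thresholds and the rescue rule are illustrative assumptions only.
    """
    accepted = []
    for variant, quality in candidates.items():
        supported = profile.get(variant, 0.0) >= min_frequency
        if quality >= min_quality or (supported and quality >= rescue_quality):
            accepted.append(variant)
    return sorted(accepted)


# Synthetic cohort: sample id -> set of called variants.
cohort = {
    "S1": {"chr1:1000:A>G", "chr2:500:C>T"},
    "S2": {"chr1:1000:A>G"},
    "S3": {"chr1:1000:A>G", "chr3:42:G>A"},
    "S4": {"chr2:500:C>T"},
}
profile = build_profile(cohort)

# Candidate calls for a new sample: variant -> quality score.
candidates = {"chr1:1000:A>G": 24.0, "chr5:77:T>C": 24.0, "chr2:500:C>T": 45.0}
accepted = call_with_profile(candidates, profile)
print(accepted)  # ['chr1:1000:A>G', 'chr2:500:C>T']
```

In this toy example the two borderline candidates have the same quality score, but only the one that also occurs in the cohort profile is kept. The new sample is called using the profile alone, without access to the individual cohort samples.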
Unlike other solutions, 42Genetics Population is incremental. This means that adding 100 samples to a population of 1,000 incurs the processing cost of only the 100 new samples. There is no need to revisit the 1,000 samples that are already part of the GVM. This approach handles thousands of samples at a cost that grows linearly, following a natural production flow.
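The cost argument can be made concrete with a toy model in Python. This is a minimal sketch under stated assumptions, not the GVM implementation: the `ToyVariantMap` class, its fields and the synthetic variants are hypothetical, and "cost" is simply counted as the number of per-sample updates performed.

```python
from collections import Counter


class ToyVariantMap:
    """Hypothetical stand-in for a variant map; not the GVM format or API."""

    def __init__(self):
        self.calls = {}             # sample id -> set of variant keys
        self.counts = Counter()     # variant key -> number of samples carrying it
        self.samples_processed = 0  # number of per-sample updates performed

    def add_samples(self, new_samples):
        """Add samples by touching only the new ones; existing samples are not revisited."""
        for sample_id, variants in new_samples.items():
            if sample_id in self.calls:
                raise ValueError(f"sample {sample_id} is already in the map")
            variants = set(variants)
            self.calls[sample_id] = variants
            self.counts.update(variants)
            self.samples_processed += 1


def make_samples(start, n):
    # Synthetic data: each sample carries one shared and one private variant.
    return {f"S{i}": {"chr1:1000:A>G", f"chr2:{i}:C>T"} for i in range(start, start + n)}


gvm = ToyVariantMap()
gvm.add_samples(make_samples(0, 1000))     # initial population of 1,000
before = gvm.samples_processed

gvm.add_samples(make_samples(1000, 100))   # add 100 more samples
added_cost = gvm.samples_processed - before
print(added_cost)  # 100
```

The aggregate counts are updated in place, so the second call performs 100 updates rather than 1,100, while the shared variant is still counted across all 1,100 samples.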