Before ReLocator, methods for estimating the geographic location of a sample fell into two main categories, each with clear limitations.
The first category comprises unsupervised genotype clustering and dimensionality reduction techniques. These methods jointly analyze genetic data from samples of known and unknown origin, then assign each unknown sample to the location of the known samples that fall in the same genotype cluster or the same region of principal component space. However, this approach requires an additional mapping step to convert genotype clusters into geographic coordinates, and it may produce unreasonable results if the unknown sample is a hybrid or comes from a population that is not represented in the reference set.
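The first category can be illustrated with a minimal sketch, assuming a PCA computed by SVD followed by a nearest-neighbor assignment in principal component space. This is one simple instance of the general idea, not the procedure of any particular published tool; the function names (`pca_scores`, `assign_locations`, `simulate_two_populations`) and the simulated two-population data are hypothetical and exist only for illustration.

```python
import numpy as np

def pca_scores(genotypes, n_components=2):
    """Project samples (rows) onto the top principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def assign_locations(genotypes, coords, known_mask, n_components=2):
    """Give each unknown sample the coordinates of its nearest known
    sample in PC space (the extra 'mapping step' described in the text)."""
    scores = pca_scores(genotypes, n_components)  # known and unknown analyzed jointly
    known_idx = np.flatnonzero(known_mask)
    unknown_idx = np.flatnonzero(~known_mask)
    diffs = scores[unknown_idx, None, :] - scores[None, known_idx, :]
    dists = np.linalg.norm(diffs, axis=2)         # shape: (unknown, known)
    nearest = known_idx[dists.argmin(axis=1)]
    return coords[nearest]

def simulate_two_populations(n_known=10, n_snps=200, seed=0):
    """Toy data (simulated, not real): two populations with allele
    frequencies 0.1 and 0.9, sampled at (0, 0) and (10, 10)."""
    rng = np.random.default_rng(seed)
    n = n_known + 1  # the last sample of each population is treated as unknown
    geno_a = rng.binomial(2, 0.1, size=(n, n_snps))
    geno_b = rng.binomial(2, 0.9, size=(n, n_snps))
    genotypes = np.vstack([geno_a, geno_b]).astype(float)
    coords = np.vstack([np.tile([0.0, 0.0], (n, 1)),
                        np.tile([10.0, 10.0], (n, 1))])
    known_mask = np.ones(2 * n, dtype=bool)
    known_mask[[n - 1, 2 * n - 1]] = False
    return genotypes, coords, known_mask

if __name__ == "__main__":
    genotypes, coords, known_mask = simulate_two_populations()
    print(assign_locations(genotypes, coords, known_mask))
```

Because the prediction is always the location of some reference sample, a hybrid or a sample from an unsampled population would still be snapped to one of the reference locations, which is the failure mode noted above.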
The second category comprises explicit model-based methods, such as SPASIBA and SCAT. These methods work in two steps: first, they estimate a smooth map of each allele's frequency across space from the genotypes of individuals sampled at known locations; second, they predict the position of a new sample by finding the location that maximizes the likelihood of observing its combination of alleles. These methods typically assume that allele frequencies follow a specific functional form (e.g., a Gaussian function), which imposes strict assumptions on the model, and they are computationally expensive.
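The two-step logic can be sketched as follows. This is a toy illustration and not the SCAT or SPASIBA model: a Gaussian kernel smoother stands in for their spatial frequency models, the likelihood assumes diploid genotypes with independent loci under Hardy-Weinberg proportions, and the search is a brute-force grid scan. All function names, the bandwidth, and the simulated cline data are assumptions made for the example.

```python
import numpy as np

def smooth_frequency_maps(known_xy, known_geno, grid_xy, bandwidth=1.0):
    """Step 1: kernel-smoothed allele frequency of every SNP at every grid point."""
    d2 = ((grid_xy[:, None, :] - known_xy[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-0.5 * d2 / bandwidth**2)            # shape: (grid, known)
    # Weighted allele count over weighted allele number; the pseudo-counts
    # keep every frequency strictly inside (0, 1) so the logs stay finite.
    return (w @ known_geno + 0.5) / (2.0 * w.sum(axis=1, keepdims=True) + 1.0)

def locate(sample_geno, freq, grid_xy):
    """Step 2: return the grid point maximizing the binomial log-likelihood."""
    loglik = (sample_geno * np.log(freq)
              + (2 - sample_geno) * np.log1p(-freq)).sum(axis=1)
    return grid_xy[loglik.argmax()], loglik

def cline_frequencies(xy, n_snps):
    """Toy truth: half the SNPs vary linearly with x, the other half with y."""
    half = n_snps // 2
    p = 0.05 + 0.09 * xy                            # in [0.05, 0.95] on a 0..10 square
    return np.hstack([np.repeat(p[:, :1], half, axis=1),
                      np.repeat(p[:, 1:], n_snps - half, axis=1)])

def simulate_cline(n_snps=1000, seed=1):
    """Simulated (not real) reference panel on an 11 x 11 lattice plus one query sample."""
    rng = np.random.default_rng(seed)
    ticks = np.linspace(0, 10, 11)
    known_xy = np.array([(x, y) for x in ticks for y in ticks])
    known_geno = rng.binomial(2, cline_frequencies(known_xy, n_snps))
    fine = np.linspace(0, 10, 21)
    grid_xy = np.array([(x, y) for x in fine for y in fine])
    true_xy = np.array([7.0, 3.0])
    sample_geno = rng.binomial(2, cline_frequencies(true_xy[None, :], n_snps))[0]
    return known_xy, known_geno, grid_xy, true_xy, sample_geno

if __name__ == "__main__":
    known_xy, known_geno, grid_xy, true_xy, sample_geno = simulate_cline()
    freq = smooth_frequency_maps(known_xy, known_geno, grid_xy)
    best_xy, _ = locate(sample_geno, freq, grid_xy)
    print("true:", true_xy, "estimated:", best_xy)
```

Even in this simplified form, the cost grows with the number of grid points times the number of loci, and the estimate depends on the assumed smoothing form and bandwidth, which mirrors the computational and modeling burdens described above.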