According to World Health Organization estimates, India will have the greatest number of human immunodeficiency virus (HIV)-infected individuals of any country by the end of this decade (1, 6). High rates of sexually transmitted diseases, rapidly increasing seroprevalence in female commercial sex workers, and inadequate facilities for HIV testing, counseling, and prevention are the major factors contributing to the recent explosive increase in the number of HIV infections (5, 6, 24, 29). Although antiretroviral drugs have reduced mortality from AIDS in developed nations, their effect will be negligible elsewhere because of their cost. For most communicable diseases, vaccines offer the most cost-effective control strategy. It is likely that development of a vaccine for HIV will require knowledge of the viral variants being transmitted in the target population. Despite India’s impending predominance in the worldwide pandemic, little is known about the genetic diversity of HIV-1 there. The HIV-1 sequence database is growing exponentially, but the distribution of submitted sequences is not representative of the worldwide picture. Subtype C has been reported in nearly every region affected by HIV-1 (11, 23, 28) and predominates in India; it also causes 74% of infections in southern Africa and 96% of infections in northern Africa (11, 18, 32). Given the combined population of India and the other regions affected, subtype C is likely to be the most commonly transmitted HIV-1 subtype worldwide. In contrast, only 7% of the available HIV-1 sequence data are from subtype C-infected individuals (37), and of the 46 completely sequenced HIV-1 genomes (excluding multiple derivatives of HIV-1LAI), only two are of subtype C, one from a 1992 Brazilian sample and the other from a 1986 Ethiopian sample (37).
In November 1997, an analysis of cross-clade epitope variation (9) excluded the C clade from evaluation of p24gag epitopes because of a lack of sequence data, whereas sufficient data were available to analyze subtypes A, B, D, F, G, and H (no HIV-1 harboring a subtype E gag gene has been found). Further sequence data from subtype C are needed, but the past approach of generating data from small subgenomic amplicons is no longer sufficient. Recent developments have made full-genome characterization of HIV-1 isolates both important and feasible. First, the recognition of intersubtype recombination in a significant proportion of HIV-1 sequences (44, 45) has led to detection of mosaic genomes in many regions of the world affected by multiple subtypes (14, 17, 31). Subtypes A, B, and C have been reported in India (4, 22, 30, 31, 59), but mosaic HIV-1 has not. Because such recombinants exist, characterization of variants from subgenomic segments alone is incomplete. Second, immune responses to vaccines based on single genes such as env have been limited (13), and attention is shifting toward multivalent vaccines that incorporate other gene products. Third, interactions among discontinuous regions of the genome, such as between the long terminal repeat (LTR) and pol (26), can be detected only when such regions can be analyzed from the same template. In an effort to characterize the subtype C virus genomes currently being transmitted in India, viral isolates were obtained from individuals with seroincident infections there. Three of the isolates (collected in 1994 and 1995) were known to be non-syncytium inducing (NSI) and therefore resembled viruses transmitted through unprotected sexual contact, which account for 75 to 85% of new infections (2, 15, 61). These isolates were cloned, and nearly full-length genomic sequences were determined.
Detailed sequence analysis was performed, as was an analysis of variation in characterized cytotoxic T lymphocyte (CTL) epitopes.