The metagenomic paradigm allows the metabolic and functional potential of microbes in a community to be understood through the study of their proteins. The substrate for protein identification is either the set of individual nucleotide reads generated from metagenomic samples or the set of contig sequences produced by assembling these reads. However, a read-based strategy applied to reads generated by Next Generation Sequencing (NGS) technologies yields an overwhelming majority of partial-length protein predictions. A nucleotide assembly-based strategy does not fare much better, since metagenomic assemblies are typically very fragmented and leave a large fraction of reads unassembled. Here we present an assembly strategy that reconstructs complete protein sequences directly from NGS reads. We also introduce a novel strategy for the accurate identification of homologs of a query protein sequence in a metagenomic database of short protein fragments.