Conceived and designed the experiments: PJW LF AB. Performed the experiments: PJW. Analyzed the data: PJW AB. Contributed reagents/materials/analysis tools: PJW LF AB. Wrote the paper: PJW.
PJW, LF, and AB are employed at Foodwiki. However, this employment does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.
Indispensible amino acids (IAAs) are used by the body in different proportions. Most animal-based foods provide these IAAs in roughly the needed proportions, but many plant-based foods provide different proportions of IAAs. To explore how these plant-based foods can be better used in human nutrition, we have created the computational tool vProtein to identify optimal food complements to satisfy human protein needs.
vProtein uses 1251 plant-based foods listed in the United States Department of Agriculture standard release 22 database to determine the quantity of each food or pair of foods required to satisfy human IAA needs as determined by the 2005 daily recommended intake. The quantity of food in a pair is found using a linear programming approach that minimizes total calories, total excess IAAs, or the total weight of the combination.
For single foods, vProtein identifies foods with particularly balanced IAA patterns such as wheat germ, quinoa, and cauliflower. vProtein also identifies foods with particularly unbalanced IAA patterns such as macadamia nuts, degermed corn products, and wakame seaweed. Although less useful alone, some unbalanced foods provide unusually good complements, such as Brazil nuts to legumes.
Interestingly, vProtein finds no statistically significant bias toward grain/legume pairings for protein complementation. These analyses suggest that pairings of plant-based foods should be based on the individual foods themselves instead of based on broader food group-food group pairings. Overall, the most efficient pairings include sweet corn/tomatoes, apple/coconut, and sweet corn/cherry. The top pairings also highlight the utility of less common protein sources such as the seaweeds laver and spirulina, pumpkin leaves, and lambsquarters. From a public health perspective, many of the food pairings represent novel, low cost food sources to combat malnutrition. Full analysis results are available online at
The human body requires a small set of indispensible amino acids (IAAs) in a defined proportion. These IAAs are provided in roughly the same proportion in most animal-based foods, but are often found in different proportions in plant-based foods
Broadly, complementation involves consuming two or more foods together to yield an amino acid pattern that is better than the sum of the two foods alone. A simplified example of complementation with three hypothetical amino acids is shown in
Note that the optimial pairing (B) is balanced in that all of the amino acids contribute to the biological value, while the suboptimal pairings (A,C, and D) are unbalanced in that there are excess amino acids (6, 2, and 1 units respectively) that do not contribute to the biological value.
The relative proportion, or pattern, of IAAs required for human health has been the topic of considerable research. IAA patterns commonly discussed include the MIT pattern
A common use for IAA patterns is for calculating the protein digestibility corrected amino acid score (PDCAAS) for particular foods
As an alternative to using PDCAAS, we have developed an analytical tool called vProtein that uses only IAA patterns to evaluate combinations of foods. Using IAA patterns provided by the United States Department of Agriculture standard release 22 (USDA sr22) database, we use vProtein to identify single foods and pairs of foods that yield an IAA pattern most similar to the 2005 dietary reference intake (DRI) pattern
A total of 1251 plant-based foods were identified for use in the subsequent analysis. These foods fell into the following USDA defined food groups: Vegetables and Vegetable Products (559); Legumes and Legume Products (186); Cereal Grains and Pasta (153); Fruits and Fruit Juices (131); Nut and Seed Products (125); Breakfast Cereals (68); Spices and Herbs (19); Fats and Oils (6); and Beverages (4).
Each single food was analyzed to determine the mass of the food required to obtain an equivalent of 1 gram of high BV protein as defined by the 2005 DRI pattern. Of the 1251 foods analyzed, some single foods stood out as unusually balanced or unbalanced based on the mass of excess IAAs. A summary of the most balanced and unbalanced foods is provided in
Wheat based formulated nuts |
Wheat germ |
Quinoa |
Pickle Relish |
Cauliflower |
Garlic |
Cinnamon |
Hummus |
Tomatoes |
Acorns |
|
|
Macadamia nuts | (−) lysine; (−) methionine+cysteine |
Corn based breakfast cereals | (−) lysine; (+) leucine |
Degermed cornmeal | (−) lysine; (+) leucine |
Wakame seaweed | (+) valine; (−) histidine; (−) lysine |
Peeled cucumber | (+) tryptophan; (−) histidine |
Prepared mustard | (−) tryptophan |
Peas with edible pods | (+) valine; (−) histidine |
Apricots | (+) tryptophan; (−) methionine+cysteine |
Cranberries | (−) tryptophan; (−) methionine+cysteine |
Millet | (−) lysine |
Brazil nuts | (+) methionine+cysteine |
We note that the most unbalanced foods are almost always markedly deficient or in excess in a single IAA. Although the most commonly deficient IAAs are lysine and the sulfur-containing IAAs of methionine and cysteine, there are some unusual exceptions. For example, prepared mustard is nearly devoid of tryptophan, while peeled cucumber is exceptionally low in histidine.
The best overall food pairings observed in this analysis are shown in
|
|
|
Minimize calories | Sesame seed flour | Soy protein isolate |
Seaweed, spirulina | Soy protein isolate | |
Seaweed, laver | Soy protein isolate | |
– | Soy protein isolate | |
Cottonseed flour | Soy protein isolate | |
Sunflower seed flour | Soy protein isolate | |
Seaweed, spirulina | Soy sauce (tamari) | |
Seaweed, spirulina | Watercress, raw | |
Seaweed, spirulina | ** | |
Seaweed, spirulina | Pumpkin leaves | |
-------------------- | -------------------- | -------------------- |
Maximize efficiency | Sweet corn, whole kernel | Tomatoes |
Apple | Coconut meat | |
Sweet corn, whole kernel | Sweet cherries | |
Orange or tangerine juice | Edamame | |
Lambsquarters | Barley malt flour | |
Sweet corn, whole kernel | Wheat germ | |
Dates, Medjool | Edamame | |
Sweet corn, whole kernel | Lotus root | |
Apple | Mustard seed | |
Sweet corn, whole kernel | Peppers, sweet | |
-------------------- | -------------------- | -------------------- |
Minimize weight | Sesame seed flour | Soy protein isolate |
** | Soy protein isolate | |
Cottonseed flour | Soy protein isolate | |
Sunflower seed flour | Soy protein isolate | |
Brazil nuts | Soy protein isolate | |
Seaweed, spirulina | Soy protein isolate | |
Watermelon seed kernels | Soy protein isolate | |
Safflower seed flour | Soy protein isolate | |
Butternuts | Soy protein isolate | |
Peanut flour | Soy protein isolate |
In some cases, single foods (denoted by a ** in the alternate entry) outperformed food pairs. Note these tables are summaries of the larger list on the vProtein website (
The top food pairs that maximize the efficiency of IAA usage (
Examining the top 100 pairings for each food, we found no consistent pattern of food group-food group pairings. This observation was confirmed using a chi-squared test and indicated that no food group pairing was overrepresented compared to a random sampling at a p-value of 0.05 or less. Details of the top pairings by food group and chi-square statistics are provided in
The calculated pairings made by the vProtein algorithm assume that the IAA composition of each food is accurately known and unchanging. However, the nutrient composition of any food is likely to vary depending on the variety, storage and preparation conditions, and growth environment. To assess the variability of each IAA, we examined the standard error of the mean for each IAA measurement on a plant-based food in the USDA sr22 database. Expressed as a percent error, we found the following average error values: histidine 3.3%; tryptophan 2.7%; threonine 3.8%; isoleucine 2.5%; leucine 3.6%; lysine 3.3%; methionine 4.2%; cystine 4.4%; phenylalanine 2.9%; tyrosine 3.4%; valine 1.9%.
To assess the impact of this level of variability, we resampled all of the IAA data using a worst case of 5% error and reran the vProtein analysis. The resulting two-way pairings were nearly identical with only slight changes in ranking and weight calculations for each component (data not shown). This result indicates that although the IAA content of these foods does vary, this variability does not significantly affect the predictions made by the analysis.
Our analyses identified a large number of single and pairs of plant-based foods that satisfy the 2005 DRI IAA pattern. Some of these foods represent historically well-known protein sources such as soy products. In addition, the analysis has uncovered a number of less well-known protein sources, such as the pairings listed in
Note that in the current analysis, vProtein only identifies food pairs based on the measured IAA profile in the food, and as such does not account for the other macronutrient needs a person may have.
Before starting this analysis, we expected historically well-known pairings such as grain/legume to dominate the pairings. In contrast, no bias was observed in the list of top pairings when analyzed by food group. The lack of a statistically significant food group-food group association could be explained in two ways. First, it is possible that the food group labels assigned by the USDA are introducing artifacts into the analysis. For example, dried soybeans are listed as “Legumes and Legume Products,” while fresh soybeans (edamame) are listed as “Vegetables and Vegetable Products.” However, manual inspection of the food pairings reveals that nearly all foods are matched to a wide diversity of foods seemingly independent of possible food group labeling errors.
A second possible reason why no significant food group-food group pairings were found is that food groups are poor predictors of IAA patterns. Traditional protein pairings of legumes and grains are based on the assumption that legumes are generally limited in methionine or cysteine, while grains are limited in lysine
These analyses suggest that pairings of plant-based foods should be based on the individual foods themselves instead of based on a broader food group/food group pairing. By analyzing foods on a case-by-case basis, one can also define the proportion of each food required for completeness. In this analysis, food pairings ranged from less than 1% weight of one food to nearly balanced proportions—depending on the food and the optimization objective.
PCF have been widely explored as a method for supplementing infant and child food sources in resource poor areas. PCFs have the advantage of storability and the ability to prepare these meals one serving at a time
The basis of most PCFs is a micronutrient-fortified mixture of soy and rice. In this context, soy and rice provide low cost and storable sources of both calories and protein. When analyzed in vProtein, the top weight minimizing complements for “Rice flour, white” include “Cereals ready-to-eat, wheat germ, toasted, plain” (95%) followed by a long list of soy protein products. Optimal pairings of rice to soy protein products indicate an optimal ratio from 60-25% rice, depending on the soy product. This pairing result is in agreement with the observed impact of PCFs on improving growth, supporting the validity of the vProtein analysis results.
Interestingly, when the vProtein analysis is repeated to find complements to “Soy flour, defatted” we find that rice does not score well as an optimal weight minimizing complement. Instead the top scoring pairings include dried spirulina (30%), variants of cottonseed flour (15%), various forms of sesame flour (12-8%), brazil nuts (12%), safflower seed meal (17%), watermelon seed (15%), and defatted peanut flour (28%). Interestingly, in the top 100 pairings, rice only appears as rice bran.
The soy pairings identified by vProtein have two possible advantages over the rice flour pairings. First, the soy pairings produce more weight efficient combinations. For example the top scoring rice flour combinations have a minimum total weight of 73 grams to provide 25 grams of high quality protein, while the soy combinations have a minimum total weight of 45 grams to provide 25 grams of high quality protein. The additional weight of the rice complements is mainly due to starch, which may or may not be desirable as a calorie source. Overall the top rice complements provide from 260 to 360 calories, while complements to “Soybeans, mature seeds, raw” provide 180 to 260 calories.
The second advantage of the soy pairings is that they encompass a more diverse space of possible foods. This diverse space provides both flexibility in food sourcing along with a greater potential diversity of other nutrients. As an example, pairing soy to spirulina also introduces a wide variety of micronutrients, thereby reducing the amount and potentially cost of micronutrient supplementation. Field trials with spirulina supplementation to a soy, millet, and peanut mixture have shown a synergistic relationship in rehabilitating undernourished children
We expect that three food and higher order combinations will be able to match the reference IAA pattern with rapidly increasing accuracy. Ideally, the net IAA pattern from a complete meal would be examined because this is the IAA combination that the body experiences. However, exhaustively searching even all 3 way complements of the 1251 plant based foods discussed in this paper yields on the order of 1 billion possible combinations. This large search space is both computationally prohibitive and the resulting combinations are difficult to visualize.
The foods with the most unbalanced IAA patterns listed in
As an example of a possible functional food, the foods based on degermed corn contain a large excess of leucine relative to our needs (
The IAA imbalance in Brazil nuts is also nutritionally interesting because Brazil nuts only contain a significant excess of methionine and cysteine (
Using a quantitative, informatics based approach we are able to systematically explore areas of food space to optimize nutrient patterns in general, and not just for IAA patterns. A similar approach could be used to identify food combinations that contain desirable lipid, carbohydrate, mineral, and vitamin patterns, for example. These quantitative diets may or may not map to historical diets, but would better reflect our physiological needs based on current research.
All analyses were based on a set of IAA measurements for plant-based foods. The IAA measurements were obtained from the U.S. Department of Agriculture National Nutrient Database for Standard Reference, Release 22 (USDA sr22)
Next, from the set of foods with IAA measurements, we identified plant-based foods using the following two steps. First, foods were limited to the following USDA food groups: Vegetables and Vegetable Products; Legumes and Legume Products; Cereal Grains and Pasta; Fruits and Fruit Juices; Nut and Seed Products; Breakfast Cereals; Spices and Herbs; Fats and Oils; and Beverages. We recognize that other plant-based foods exist in the database, such as in the USDA food groups Ethnic Foods, Snacks, and Fast Foods, however upon further inspection we found that few if any of these entries contained IAA measurements. Second, we manually eliminated any animal-based foods that were suspected to contain dairy, eggs, or honey.
As an IAA reference pattern, vProtein uses the pattern in the 2005 dietary reference intake (DRI) published by the Food and Nutrition Board
A comparison of the 2005 DRI pattern, other commonly used reference patterns, and the IAA patterns in foods commonly identified as complete proteins (egg white, whey protein, milk, and chicken breast), demonstrates the similarity between these patterns (
Note that the IAA pattern used in this work is a generally accepted consensus profile, but may not optimally apply to all age groups, environmental conditions, or life stage. As an example, work in 2010 suggests a slightly different IAA pattern for 2 year old children
For a single food, there exists a unique quantity that will provide at least the reference pattern of each IAA. This food quantity is determined by the most deficient IAA in the food—a similar process as is used to calculate the amino acid component of the PCDAAS value. Once the limiting IAA is found, the food quantity is rescaled by the reciprocal of the ratio of this IAA to the reference. This rescaling ensures that the food mass will contain exactly the reference value for the limiting IAA, and an excess of the remaining IAAs. Mathematically, this scaling weight can be expressed as:
Identifying pairs of complementary foods is more complex and requires two types of constraints. The first type of constraint requires that the food mixture contains at least the IAA content defined by the reference pattern. Alone, this constraint reduces the solution space, but does not define a unique weight for each food in the combination. To identify a single solution, we used a second constraint to minimize the total calories, total weight, or maximize efficiency by minimizing the total mass of IAAs beyond the reference pattern. Using these two kinds of constraints generates a standard linear programming problem.
Mathematically, these two groups of constraints can be expressed as the following:
Minimize one of the following:
Note that by minimizing the total weight of the combinations, we bias the optimization toward combinations that are more practical from an eating perspective as they will tend to be smaller. By minimizing the total calories, we bias the search toward combinations with a higher protein to calorie ratio. By maximizing the IAA efficiency, we simultaneously minimize the mass of excess IAAs and minimize the total protein content of the combination while still satisfying the reference IAA pattern.
Optimization was carried out using the convex optimization package CVXOPT version 1.1.2
The relative weighting of all possible food pairings were tested, producing approximately 7.8×105 pairs for each optimization condition.
High scoring food complements were analyzed to see if they mapped to a preferred USDA food group pairing. For example, were foods from the food group “Legumes and Legume Products” more frequently paired with foods from the group “Cereal Grains and Pasta?” The analysis was done separately for pairings identified from minimizing excess, calories, and weight. The list of observed pairs was made up of the top 100 pairs for each food, and then mapped to the food group for each food in the pairing. The resulting food group pairing frequency was then compared to the expected frequency of sampling the pair at random from our list of plant-based foods. This comparison was carried out using a Pearson's chi-square test to obtain p-values for each pair of foods.
The analysis results were formatted for search and display online at the website
(TIFF)
(TIFF)
A sampling of amino acid profiles of some of the particularly unbalanced foods listed in
(TIFF)
Minimum Excess: Overall food group based pairings, chi-square statistics, and top-food matches.
(DOCX)
Minimum Calories: Overall food group based pairings, chi-square statistics, and top-food matches.
(DOCX)
Minimum Weight: Overall food group based pairings, chi-square statistics, and top-food matches.
(DOCX)