Neural networks are a popular machine learning tool, particularly in applications such as the prediction of protein secondary structure. However, overfitting poses an obstacle to their effective use for this and other problems. Due to the large number of parameters in a typical neural network, one may obtain a network fit that perfectly predicts the learning data yet fails to generalize to other data sets. One way of reducing the size of the parameter space is to alter the network topology so that some edges are removed; however, it is often not immediately apparent which edges should be eliminated. We propose a data-adaptive method of selecting an optimal network architecture using the Deletion/Substitution/Addition algorithm introduced in Sinisi and van der Laan (2004) and Molinaro and van der Laan (2004). Results of this approach in the regression case are presented on two simulated data sets and the diabetes data of Efron et al. (2002).
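To make the architecture search concrete, here is a minimal sketch of the idea behind a Deletion/Substitution/Addition-style search over network edge sets. This is an illustration only, not the authors' implementation: the greedy move loop and the toy loss function (which stands in for a cross-validated risk, with two edges treated as truly needed and every extra edge paying a small complexity cost) are assumptions made for the example.

```python
from itertools import product

def dsa_search(candidate_edges, loss, start):
    """Greedy Deletion/Substitution/Addition search over edge sets.

    At each step, evaluate every single-edge deletion, every
    substitution (swap one current edge for one absent edge), and
    every addition, then move to the best neighbour if it lowers
    the loss; stop when no move improves.
    """
    current = frozenset(start)
    best = loss(current)
    while True:
        moves = []
        for e in current:                        # deletions
            moves.append(current - {e})
        absent = candidate_edges - current
        for e in current:                        # substitutions
            for f in absent:
                moves.append((current - {e}) | {f})
        for f in absent:                         # additions
            moves.append(current | {f})
        candidate = min(moves, key=loss, default=current)
        if loss(candidate) < best:
            current, best = candidate, loss(candidate)
        else:
            return current, best

# Toy criterion standing in for cross-validated risk: edges (0, 0)
# and (1, 0) are "truly" needed; each extra edge adds a small penalty,
# mimicking the overfitting cost of a too-large parameter space.
TRUE = frozenset({(0, 0), (1, 0)})
ALL = frozenset(product(range(3), range(2)))     # 3 inputs x 2 hidden units

def toy_loss(edges):
    return len(TRUE - edges) + 0.1 * len(edges - TRUE)

# Starting from the fully connected topology, the search prunes the
# superfluous edges and recovers the two needed ones.
best_edges, best_loss = dsa_search(ALL, toy_loss, ALL)
```

In the paper's setting the loss would be an honest cross-validated risk estimate rather than this synthetic criterion, so the selected topology adapts to the data instead of a known truth.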