Hierarchical cognitive models of printed word processing (e.g., Dehaene, 2005) build increasingly abstract and compact representations. We show that a connectionist model based on Cascade-Correlation (Cascor) can perform such useful abstractions. We used a simple yet realistic task: learning word-position independence via a modified encoder task in which inputs are presented at different locations. Cascor successfully encoded input patterns onto a smaller set of hidden units. When trained on frequent words, Cascor generalized to other, unseen words, but not to random strings of characters. Generalization to known words in unseen positions improves as more words are presented. These results suggest that Cascor simultaneously learned regularities related to word structure (i.e., that certain letters never occur in certain positions, such as the letter Q at the end of words) and to position (location invariance). We also find that the representations built are compatible with the concept of open bigrams: networks make the least error on contiguous, forward strings of two letters (bigrams), marginally more on non-contiguous forward bigrams, and significantly more on backward (reversed) bigrams.
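To make the task setup concrete, the sketch below constructs position-shifted encoder patterns of the kind described above. It is a minimal illustration under stated assumptions, not the paper's implementation: the word length, number of input slots, one-hot letter coding, and the use of a location-free copy of the word as the target are all hypothetical choices; the actual coding scheme and target structure may differ.

```python
import numpy as np

# Hypothetical parameters; the paper's actual coding scheme may differ.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
WORD_LEN = 4   # length of the training words (assumption)
N_SLOTS = 6    # number of letter slots in the input layer (assumption)

def one_hot(letter):
    """26-dimensional one-hot vector for a single letter."""
    v = np.zeros(len(ALPHABET))
    v[ALPHABET.index(letter)] = 1.0
    return v

def encode_at_position(word, start):
    """Place the word's letter codes into the input slots starting at
    `start`; unused slots stay all-zero, shifting the word's location."""
    slots = np.zeros((N_SLOTS, len(ALPHABET)))
    for i, letter in enumerate(word):
        slots[start + i] = one_hot(letter)
    return slots.ravel()

def build_encoder_patterns(words):
    """Modified encoder task: each word appears at every possible
    location; the target here is a location-free copy of the word
    (one plausible reading of position-independent encoding)."""
    inputs, targets = [], []
    for word in words:
        target = np.concatenate([one_hot(l) for l in word])
        for start in range(N_SLOTS - WORD_LEN + 1):
            inputs.append(encode_at_position(word, start))
            targets.append(target)
    return np.array(inputs), np.array(targets)

X, Y = build_encoder_patterns(["word", "that", "with"])
print(X.shape, Y.shape)  # (9, 156) (9, 104) under these assumptions
```

A Cascade-Correlation network trained on such pattern pairs would have to map the same word, presented at any of the allowed locations, onto a single compact code, which is the sense in which the hidden units can come to encode both letter-position regularities and location invariance.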