Newport's "less is more" account of critical-period effects in language acquisition---that young children may be aided rather than hindered by limited cognitive resources---has received computational support from Elman's demonstrations that effective learning of English-like artificial grammars by recurrent connectionist networks performing word prediction depends on "starting small" (e.g., starting with limited memory that only gradually improves). The current talk presents connectionist simulations that indicate, to the contrary, that language learning by recurrent networks does not depend on starting small; in fact, such restrictions hinder acquisition as the languages are made more English-like by introducing graded semantic constraints. Such networks can nonetheless exhibit apparent critical-period effects as a result of the entrenchment of representations learned in performing other tasks, including other languages. An extension of the approach demonstrates how seemingly categorical grammaticality distinctions, such as constraints on "want to"/"wanna" contraction, can be induced by sensitivity to the graded statistical structure of the full language. Finally, although the word prediction task may appear unrelated to actual language processing, a preliminary large-scale simulation illustrates how performing implicit prediction during sentence comprehension can provide indirect training for sentence production. The results suggest that language learning may succeed in the absence of innate maturational constraints or explicit negative evidence by taking advantage of the indirect negative evidence that is available in performing online implicit prediction.