I entirely agree with you that language is innate (whether there is an innate relationship between the language faculty and the hearing / speaking ability is another issue), but this does not mean we have a universal phonological structure in our heads. I honestly don't know what we have in our heads, but I am convinced that most if not all phonological 'universals' can be explained nicely on purely phonetic grounds. What is innate is the perfection of our ability to adapt ourselves to our phonatory organs.
I don't think "universal phonology" is all encoded in our brains in symbolic form. If by postulating explanation on purely phonetic grounds you mean that our phonological preferences depend e.g. on the anatomy and innervation of the vocal tract (which ARE genomically determined and somewhat variable structures), or the logically inevitable physical constraints on effective communication by acoustic means, I agree with you on that point. Have you heard of that recent experiment in which young ferrets' auditory and visual nerves were "switched round" so that the auditory cortex had to learn to see -- and it did? If the communication system of some early hominids was gestural, it was chiefly the visual cortex that was engaged in understanding "linguistic" signals -- and the same happens when we learn to read, or can't hear and communicate using sign language (which also has an analogue of phonological structure!). Given the versatility of human behaviour, all learning is certainly data-driven and cannot rely on parameter-setting in predetermined patterns. Which doesn't mean that there are NO innate guidelines for pattern-recognition, also in phonology.
>How would _you_ account for the total absence of clicks from languages outside the African "Clickland"?

That is an interesting issue -- I would attribute this to coincidence. Bantu languages in contact with Khoi-san quickly acquired those clicks. If history had been different... maybe the entire world would click!
True, but remember that the Nguni Bantu languages have very small click inventories as compared with Khoisan. If I remember aright, Xhosa has nine click phonemes, while in a Khoisan language a system with thirty clicks would count as modest-sized!
The issue is -- why aren't clicks a part of the basic segmental inventory of all languages if they are perceptually so easy to distinguish? Pronouncing a click on its own is easy, but accompanying it with a vowel is articulatorily more difficult; using clicks as consonants is more difficult than using them as simple sounds in communication.
You could say the same of other phonation types, especially ejectives and implosives, which are however much more common than clicks and can be found scattered here and there all over the globe.
I have an idle proposal for the clicks in Khoi-san languages (one that is not testable but still nice to think about). Language is more ancient than Homo sapiens (maybe even more ancient than the genus Homo); it was invented independently and progressively by different species of hominids. The modern human languages are polygenetic and are descended from languages invented by different species of men. One of those proto-languages used clicks and handed them down to the present because clicks are, as you observed, extremely resistant to sound change. The others did not. They didn't develop those clicks because, well, they did not think of it. Clicks already had a use as a sound (like whistling). Their languages were already developed, so they couldn't have introduced an entire series of new features into them. Introducing clicks into your language is not possible without an explicit reason like borrowing. Clicks cannot come from phonetic changes in normal consonants.
I agree languages are most likely HIGHLY polygenetic, though on the other hand I don't think the family-tree metaphor has much validity for the early stages of linguistic evolution, when languages probably formed entangled networks rather than neat families. A split that existed, say, 200 kyr ago would have been obliterated not much later owing to areal convergence. If clicks have been used in southern Africa since time out of mind, it appears all the more likely that some kind of barrier has prevented them from spreading farther north and out of Africa.