Ideas: Korean Puzzles

Saturday, May 10, 2014

Korean Puzzles

I've just spent a pleasant few days in Seoul and I have two questions:

1. Why is it that a majority of the population shares a single last name (Kim)?

2. There is no place in Seoul where one is more than a few hundred feet from someone selling food (casual observation, not scholarly claim). Diet soft drinks are considerably less common in Korea than in the U.S. Yet almost nobody is fat. Why?

45 comments:

machete said...: Kim or Lee or Park probably accounts for over half of all Koreans. Throw in Choi, Chung/Yung, and Shin and you're maybe looking at 2/3rds.

My guess is it has to do with how culturally insular they are, and their feudal history.; 1:57 AM, May 10, 2014
Tibor said...: My guess is that in Korea there is significantly more social pressure to be thin than in the US or Europe. Asian parents in general seem to have no problem calling their daughter fat and pressing her to lose weight. What also supports this is the common occurrence of plastic surgery (at least if what I have read about Korea from people who lived there for a while is correct). It seems to be commonplace for schoolgirls (and even boys) to wish for plastic surgery as a high school graduation present from their parents (the eyelids in particular are often modified to look more European). Korean society simply seems to put more stress on looks in general.

I am also reminded of one conversation with a girl from Hong Kong (which is of course a different country, but at least to me it seems the culture is similar) who said that good looks require hard work as well and so it should be admired as much or almost as much as other qualities and skills. If that is a common point of view in that society, then it makes sense that people try to look as good as they can (even more so than in the West), since not looking great may be associated with being lazy or not caring about yourself.; 4:36 AM, May 10, 2014
David Friedman said...: I was told by a Korean that more than half of Koreans were named Kim, but an online source claims only that it's more than a fifth.; 4:40 AM, May 10, 2014
Bob Smith said...: Genetics. Asian populations are not only thinner than whites, but they're more likely to become diabetic.; 4:48 AM, May 10, 2014
Anonymous said...: those last names are so called "noble" last names. And if you were a noble you could have many children through many wives. On top of this, people with working class last names could purchase a noble last name, further increasing the concentration of noble last names.; 6:56 AM, May 10, 2014
Sal said...: Cuz they love kim chi; 9:10 AM, May 10, 2014
Sal said...: "Yet almost nobody is fat. Why?"

Korean food is water based rather than fat based.

It's bigger on salt and hot pepper than on sugar.

It tastes so bad, people don't eat much.

Fatsos are openly ostracized.; 9:12 AM, May 10, 2014
Bruce said...: Everyone I know laughs at me for drinking diet soda, and they are all skinnier than me. Art de Vany thinks diet drinks fool your body into thinking you are eating sweets, and are therefore less effective than a simple calorie count would suggest.; 4:22 PM, May 10, 2014
The Laconic said...: "Korean food is water based rather than fat based."

If dietary fat makes you fat, that's news to me.; 7:34 PM, May 10, 2014
kylind said...: I think the diversity of surnames in different cultures has to have something to do with the time since last names came into use.
In East Asia, family names have been in use for a long time, so there was a lot of time for names to die out. If there are no children, or only daughters in one generation, that name will die out. (Assuming people don't change their names to create new ones.)

In Europe, many people (for instance many Dutch and also many European Jews) adopted surnames as late as the 19th century. So there hasn't been nearly as much time for names to die out.; 2:06 AM, May 11, 2014
Tibor said...: Kylind: This is an interesting point. For a family name to survive indefinitely (assuming some simplifying conditions, such as that the number of male members of a generation is identically distributed across all generations and independent of all other generations) you need the expected number of male members with that family name in a generation to be higher than one. If you start with a country where each immediate family has its own unique family name, this is probably not true. But as some of the names die out and others grow, this will eventually be satisfied and you should reach a stable distribution of the number of family names (assuming no unlikely but significant events, such as people with certain surnames - say Jewish ones - being systematically killed off), at least if the population size also eventually reaches a certain stable size. It could be interesting to ask what this number is.; 4:10 AM, May 11, 2014
Daniel Lucraft said...: Daniel Tudor discusses the surname question in Korea: The Impossible Country, which is an excellent recent book on South Korea.

From my notes, in the time of the Three Kingdoms surnames were a privilege bestowed by the king. (On the other hand the king would sometimes force someone to take a surname like Pig or Cow as a punishment :)) Only the yangban ruling caste had surnames, and only the yangban could enter the civil service exams, among other privileges. By the time of the Joseon dynasty, it was possible to buy surnames from prestigious families. Many many middle class Koreans took advantage of this by purchasing the Kim, Lee and Park surnames, which now account for about 45% of the population.; 4:32 AM, May 11, 2014
Jonathan said...: "Diet soft drinks"? I don't drink such things. The soft drinks I consume are mainly water, unsweetened fruit juice, and milk, in descending order by volume. I've never been keen on Coca-Cola and suchlike stuff.; 5:35 AM, May 11, 2014
Handle said...: In Japan, one is never more than a few hundred feet from a drinks vending machine, with many varieties of hot and cold coffees and teas (and yes, a maximum of one diet soda).

But one much more rarely sees an America-style junk food snacks vending machine. Instead, you go to one of the countless and ubiquitous convenience stores if you don't want to go to a restaurant.; 8:27 AM, May 11, 2014
Andy Weintraub said...: It's just possible that it's the other way around: There's not much demand for diet soda because people are not fat.; 7:01 PM, May 11, 2014
Anonymous said...: Most people in Korea have Kim or Lee as their last name because only aristocrats (there were only a few aristocratic families) used to have proper names, and when the 'caste system' was finally gotten rid of, people of the lower classes started to use the names of the higher class as they pleased, for reasons such as: pretending to be from a noble family; borrowing their master's name; or just having no better idea or example for a name to take.

The reason many Koreans are thin despite all those non-diet sodas sold everywhere is the food they eat at home. Korean home cooking is exceptionally low-fat and healthy. Mothers cook such food, and kids who get to eat all those vegetables seek something with sugar and fat when they are not at home. Nevertheless, since most Korean food is based on tofu and rice, it isn't that easy to get really fat even with all the junk food one occasionally enjoys.; 8:10 PM, May 11, 2014
Glen said...: The fact that there's a very high density of people selling food suggests a walking rather than a driving culture. Which means a physically active population.

Consider NYC. Most of Manhattan has a very high density of food vendors - you can't walk far in midtown without passing a hot dog cart, gyro cart, or tiny corner grocery/deli. If mere casual access to food while walking made people fat, Manhattan would be full of fat people. But it isn't; instead it's full of people who walk everywhere they go and in the process often take subways that require navigating lots of stairs. (I'm not saying that the walking *makes* people thin, rather that it *precludes* them being *too* fat. People who get too fat or otherwise unfit are less likely to be seen walking around downtown.)

In the US, you tend to find the heavier people out in the suburbs where everyone drives everywhere to run errands. (If you have a car and live someplace with lots of parking, the fact that the nearest food is a mile away is not much obstacle to eating it.); 1:15 AM, May 12, 2014
Power Child said...: Lots of Koreans smoke. Nicotine is an appetite suppressant.

I don't know much about how East Asians' genes cause them to metabolize fats, but it seems plausible that genes play a not insignificant role in their average BMIs.

(By contrast, some plurality of white America is descended from southern Germans and northern Italians, both of whom tend to be portly and barrel-chested; much of black America is descended from west Africans, who tend to be relatively squat and stout compared to East or Sub-Saharan Africans; much of Hispanic America is also naturally squat and stout. And in the skinniest parts of America, you find the most people from other genetic backgrounds besides these.); 6:21 AM, May 12, 2014
Anonymous said...: I'd assume that there would be little demand for diet products among a population where "almost nobody is fat."; 6:35 AM, May 12, 2014
Biopolitical said...: Tibor Mach: There is no stable frequency distribution of family names unless new ones are regularly created or having a rare surname makes people more fecund. Otherwise, kylind is right that all surnames stochastically drift to extinction until a single one remains.; 9:35 AM, May 12, 2014
Tibor said...: Biopolitical: I partially disagree. If we have a constant population size (which was indeed my assumption - a useless and a wrong one at that), then you are right. We have a Markov chain on a finite state space of all family name size configurations, so all states will be reached and eventually we end up with just one name. My mistake was that I did not take into account that you cannot keep the population size while simultaneously not "stealing" family names from others. Thanks for pointing that out. If the population size can grow, however, there can be a (nontrivial) stable distribution of the name frequencies.; 1:13 PM, May 12, 2014
Biopolitical said...: Tibor Mach: All surnames but one are eventually lost even in a growing population, as long as there exists in each generation a fraction of the population that leaves no descendants (or more precisely surname-giving descendants - males in some countries). Of course, the exceptions I mentioned in my previous comment also apply to a growing population.; 1:58 AM, May 13, 2014
Tibor said...: Biopolitical: I don't see where there is a problem. If I assume that the expected number of descendants of a family name bearer is above one (which was at odds with the constant population size, but possible without that constraint), I have a positive probability for that family name never to die out. Of course, that does not mean some branches of that family tree won't die out (even conditioning on the family name to survive), but that is not a problem.; 8:26 AM, May 13, 2014
Biopolitical said...: Tibor Mach: You must focus not on the expected number of offspring per parent but on the fact that the number of offspring per parent is a random variable. As a result, surnames in each generation are a random sample of those in the previous generation. This random sampling ensures that surnames go extinct until only one remains. The expected time to extinction depends on such things as population size, the variance of offspring number per parent and so on. For example, in a large population with little variance of offspring number, surnames go extinct at a relatively slow pace.; 4:09 PM, May 13, 2014
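The random-sampling argument in the comment above can be tried out with a small simulation. What follows is only a sketch under stated assumptions: the population size is held constant at 50, every member of a new generation copies the surname of a uniformly chosen member of the previous one, and the function names are hypothetical. None of these specifics come from the thread.

```python
import random


def resample_generation(surnames, rng):
    """Build the next generation: each new member copies the surname of a
    parent drawn uniformly at random (with replacement) from the old one."""
    return [rng.choice(surnames) for _ in surnames]


def distinct_surnames_over_time(initial, rng, max_generations=100_000):
    """Return the number of distinct surnames in each generation,
    stopping as soon as only one surname is left."""
    current = list(initial)
    distinct = [len(set(current))]
    for _ in range(max_generations):
        if distinct[-1] == 1:
            break
        current = resample_generation(current, rng)
        distinct.append(len(set(current)))
    return distinct


# Assumption: 50 people, each starting with a unique surname.
rng = random.Random(2014)
start = [f"name{i}" for i in range(50)]
history = distinct_surnames_over_time(start, rng)
print(len(history) - 1, "generations until a single surname remained")
```

With a fixed population size a surname that disappears can never come back, so the count of distinct names can only fall. Larger populations take longer to reach a single name, which matches the remark about large populations losing surnames slowly.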
Tibor said...: Biopolitical: That is not true. A family name (number of the name-bearers in each generation) is a simple Galton-Watson branching process (if we simplify a bit and assume that each name-bearer's number of offspring - which is indeed a random variable - is independent of that of the others - which is another thing that the constant population size would not allow, by the way - and identically distributed). As long as that holds and as long as the expected number of offspring of each name-bearer is above 1, there is a positive probability that the name never goes extinct. It also means the population size eventually explodes to infinity (on the event that it does not go extinct)...so it is not a very realistic model, that's true :); 6:25 AM, May 14, 2014
Unknown said...: http://www.reddit.com/r/AskHistorians/comments/25jz1p/david_friedman_has_a_question_about_china_vs_korea/; 10:48 AM, May 14, 2014
Biopolitical said...: Of course "there is a positive probability of non-extinction ever," but this possibility is realized in only one surname.; 3:33 PM, May 14, 2014
Alex Nowrasteh said...: "In Korean culture, the names of family members are recorded in special family books. This makes it possible to follow the distribution of Korean family names far back in history. It is shown here that these name distributions are well described by a simple null model, the random group formation (RGF) model. This model makes it possible to predict how the name distributions change and these predictions are shown to be borne out. In particular, the RGF model predicts that for married women entering a collection of family books in a certain year, the occurrence of the most common family name 'Kim' should be directly proportional to the total number of married women with the same proportionality constant for all the years. This prediction is also borne out to a high degree. We speculate that it reflects some inherent social stability in the Korean culture. In addition, we obtain an estimate of the total population of the Korean culture down to the year 500 AD, based on the RGF model, and find about ten thousand Kims."

http://iopscience.iop.org/1367-2630/13/7/073036/fulltext/; 9:02 AM, May 15, 2014
Tibor said...: Biopolitical: About this I am not exactly sure, to be honest. But I don't see why that is obvious, at least. If I have two (or more) family names and no constraints on the population size, they can grow simply by 'inbreeding'. Without the constant population size, the state space is no longer finite and the 'trap' (or absorbing) states of the chain never have to be reached at all with a positive probability.

Maybe I am missing something, but please point out what it is exactly if that is the case. The situation is obviously complicated by the fact that the 1-dimensional chains (i.e. those that keep track of one family name) are not independent and the exact properties would likely depend on the exact nature of interaction. But, as I said, maybe you're right, but it does not seem to me as quite obvious. I'd expect your statement to be true only if the chances of 'interbreeding' are high enough, where "high enough" is something one would have to specify quite carefully. My initial guess is something like this - if there is enough interbreeding (in terms of the number of offspring) to on average offset the members 'lost' due to change of their name, then there still will be a positive chance for the two (or more in a more complicated setting) family names to survive indefinitely (while of course, their respective sizes are going to tend to infinity).; 10:36 AM, May 15, 2014
Joey said...: This comment has been removed by the author.; 2:27 PM, May 15, 2014
Joey said...: Tibor,

"I'd expect your statement to be true only if the chances of 'interbreeding' are high enough, where "high enough" is something one would have to specify quite carefully."

It sounded like Biopolitical was assuming random mating.

The model would look something like this:

Begin with a 50:50 proportion of surnames A and B, totaling N in size. Take some percentage more than N random trials from the population (with replacement) to make a new larger population with N_1 elements. Repeat with the same percentage more than N_1 trials...

This model assumes that the variance in individual mating patterns is small relative to the population size.

My intuition agrees with biopolitical: the relative frequency of one surname would diminish to zero even in a growing population.

I don't know whether the growth rate of the population affects the half-life of the smaller surname in this model...

Reposted to correct a formatting error.; 2:30 PM, May 15, 2014
Biopolitical said...: Tibor Mach: Inbreeding has no effect on the extinction of all surnames but one. People could as well be asexual.

Posing an infinite human population or a population where every person that is ever born passes his or her surname to at least one offspring may be mathematically interesting but has no bearing on surname dynamics in real human populations. There is a finite number of humans and there is randomness in reproduction and thus all surnames go extinct except one (unless rare surnames confer a reproductive advantage to their bearers).; 2:12 AM, May 16, 2014
Tibor said...: Biopolitical: I did not say that everyone passes their name on to at least one offspring, but that this happens on average (or that the expected number of offspring is at least one, which is the same thing). That obviously implies that the population goes to infinity eventually.

You're right that inbreeding doesn't matter though, because I was only talking about the average number of name-bearing offspring and as long as we don't complicate things with the fact that a woman cannot bear 50 or so children in her life, the reproduction could really be asexual. But if we don't complicate it that way, then the situation is actually as clear as I thought originally; we only have to omit the assumption of constant population size. As long as the expected number of name-bearing offspring is above one, the population can survive with a positive probability and this is true no matter how many other populations are also present. They can only take the women away, but the women do not carry the name over, so in this simplified model, they won't change the outcome.

By the way, if we limit the population size to a constant, we will eventually see an extinction of the entire population, again, regardless of the number of distinct names (but the less populous or fertile are of course more likely to die out first). At least as long as there is a positive chance for an individual to have no offspring.

But you're right, none of these models are very accurate or interesting. There are more complex branching processes which allow for both non-extinction and a limited size population which is much more realistic, but those are quite advanced and I don't know that much about those (currently, I mostly deal with the opposite - coalescent processes that track the genealogies of genes, populations or just some other particles with similar properties back in time). But it could be fun to use those and it could actually lead to some meaningful results.; 4:32 AM, May 16, 2014
Tibor said...: Joseph, you wrote:

"Begin with a 50:50 proportion of surnames A and B, totaling N in size. Take some percentage more than N random trials from the population (with replacement) to make a new larger population with N_1 elements. Repeat with the same percentage more than N_1 trials..."

I'm not sure what you mean by:

"Take some percentage more than N random trials from the population"

Could you please rephrase it?

Thanks.

By the way, "my" model is this:

For each family name, let X_n be the number of its name-bearing members in generation n (let's just continue addressing them as men, as it is usually men who pass on the name). Now X_(n+1) can be expressed as the sum over k from 1 to X_n of I^n_k, where I^n_k is the number of sons of the k-th individual of the n-th generation. Suppose that the I^n_k are independent and identically distributed with a distribution that has an expected (/mean/average) value above 1. Then there is a positive chance of surviving forever (in which event the population explodes).

The other family names are not a problem. The only thing they can do is that they can "steal" the daughters. But that is fine, since daughters do not contribute to the spread of the name directly (of course, they are needed to produce more sons, but it doesn't matter what their name is). As I said, introducing more realistic constraints such as "a woman can only bear at most N children" will change the model quite a bit, since currently they can be simply ignored (as Biopolitical pointed out), since theoretically one woman can be a mother to the entire next generation. If they are not ignored, then the model as I described it breaks down entirely and another has to be used to describe the population...then I would go with my guess about the intensity of interbreeding between the names, but that is really just a hunch.; 4:55 AM, May 16, 2014
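The branching model described in this comment, X_(n+1) as a sum of X_n independent son counts, is easy to simulate. The sketch below is a rough Monte Carlo illustration, not anything posted in the thread. The offspring distribution (0, 1 or 2 sons with probabilities 0.3, 0.3, 0.4, mean 1.1) is a made-up example, and a line is counted as "surviving" once it reaches 200 men, which is an approximation to surviving forever.

```python
import random

# Hypothetical offspring distribution (an assumption, not from the thread):
# P(0 sons) = 0.3, P(1 son) = 0.3, P(2 sons) = 0.4, so the mean is 1.1.
SONS = [0, 1, 2]
PROBS = [0.3, 0.3, 0.4]


def line_survives(rng, cap=200, max_generations=500):
    """Follow one surname founded by a single man.

    Returns False if the line dies out, True once it reaches `cap` men
    (from there extinction is so unlikely that we treat it as survival).
    """
    x = 1  # X_0: one founder
    for _ in range(max_generations):
        if x == 0:
            return False
        if x >= cap:
            return True
        # X_(n+1) = sum of the son counts of the X_n current name-bearers
        x = sum(rng.choices(SONS, weights=PROBS, k=x))
    return x > 0


def estimate_survival(trials, seed=0):
    """Fraction of simulated surname lines that survive."""
    rng = random.Random(seed)
    return sum(line_survives(rng) for _ in range(trials)) / trials


estimate = estimate_survival(4000)
print(f"estimated survival probability: {estimate:.3f}")
```

For this particular distribution the extinction probability q is the smallest root of q = 0.3 + 0.3q + 0.4q^2, which is 0.75, so the estimate should land near 0.25: positive, but well short of 1.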
Joey said...: "I'm not sure what you mean by:

'Take some percentage more than N random trials from the population'

Could you please rephrase it?"

You bet.

I meant, if you want to assume a growth rate of 5% each generation, take 1,050 random trials from the population of size 1,000. That would be your new generation, from which you would take another 1.05*(1,050) for the next generation.

Does that create the dynamic we are looking for?

I haven't read your comment in full yet, but I'll come back later.; 10:26 AM, May 17, 2014
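Joey's growing-population resampling scheme can be written down directly. The sketch below is one possible reading of it, with assumptions flagged: two surnames A and B, 5% growth per generation computed in integer arithmetic (so 1,000 becomes 1,050 and then 1,102), and each new individual drawing surname A with probability equal to A's current share. The function name and the seed are hypothetical.

```python
import random


def grow_and_resample(n_start, count_a, growth_percent, generations, rng):
    """Simulate surname A's head count in a population that grows by
    `growth_percent` each generation and is refilled by random draws
    (with replacement) from the previous generation."""
    sizes = [n_start]
    counts = [count_a]
    for _ in range(generations):
        n_old, a_old = sizes[-1], counts[-1]
        # Integer growth: 1000 -> 1050 -> 1102 -> ...
        n_new = n_old + n_old * growth_percent // 100
        share_a = a_old / n_old
        # Each new individual independently draws a parent; the parent
        # carries surname A with probability share_a.
        a_new = sum(rng.random() < share_a for _ in range(n_new))
        sizes.append(n_new)
        counts.append(a_new)
    return sizes, counts


rng = random.Random(1)
sizes, counts = grow_and_resample(1000, 500, 5, 60, rng)
for g in (0, 20, 40, 60):
    print(f"generation {g}: size {sizes[g]}, share of A {counts[g] / sizes[g]:.3f}")
```

Running this with several seeds and growth rates gives a feel for how far the share of A wanders as the population grows. The sketch by itself does not settle which side of the disagreement is right.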
Joey said...: Tibor,

"The other family names are not a problem. The only thing they can do is that they can "steal" the daughters. But that is fine, since daughters do not contribute to the spread of the name directly"

I think this underlies the disagreement. If the larger population by chance "steals" daughters from the smaller population, they haven't only increased their own next-gen size; they have reduced the growth rate of the smaller population. The extent to which the subpopulations do this accounts for the variability in reproductive success, which underlies Biopolitical's argument.

Hence, Biopolitical's comment:

"Posing...a population where every person that is ever born passes his or her surname to at least one offspring...has no bearing on surname dynamics in real human populations"

In other words, if every male always produces, say, two males, then of course no surnames will go extinct. The variance in offspring produced per male would be 0, so no "stealing" occurs.

The variability in reproductive success allows the combined growth rate to still be above 1, while the growth rate of the smaller surname is smaller than 1. Your model doesn't allow for population-level variability in reproductive success... every male in subpopulation A who does not reproduce has "given" his reproductive power to a different male with the same surname. Thus, the variance between subpopulations A and B is zero, which means that every male in subpop A and B might as well automatically always have >1 male offspring.; 11:01 AM, May 17, 2014
Anonymous said...: Weight is a function of total calorie intake to the first order. It doesn't matter if you're surrounded by food if you don't choose to eat too much of it.; 1:41 AM, May 18, 2014
Rex Little said...: I don't know if having a small number of different surnames is an Asian thing in general, but Vietnam is certainly another example. I don't have actual data, but Nguyen has got to be as common there as Kim is in Korea. Add in Pham and a couple of others, and you've got most of the country covered.; 10:27 PM, May 18, 2014
Tibor said...: Joseph: Yes, but then I got a bit confused by Biopolitical's statement that the population may as well reproduce asexually. That is the case in "my" model, but then his conclusion is not right. In your model, it is.

However, the second comment is not true. An expected number of offspring above 1 does not mean that each man has at least 2 sons, or even that in each generation the total (that is, the actual total in the realization of the random variables that are the numbers of offspring) is at least 2N, where N is the population size of one surname. The fact that a random variable has an expectation above 1 is not the same as it being above 1 almost surely. It can still be zero, that being an unlikely but possible event. That is why there is only a positive chance of nonextinction if you meet the expectation condition, but the probability is not necessarily 1 (100%).

So it is not true that in my model in each generation a man with no children just "gives his children" to a different man with the same surname. There can be generations in which the size of the surname population decreases. The threshold of one expected son is important, though: if that expectation is below 1 (or equal to 1, at least as long as the distribution is not degenerate, which would mean that everyone always has exactly one son), then there is zero chance of survival forever; if it is above 1, there is a positive chance.

Also note that you cannot have the numbers of sons i.i.d. (independent and identically distributed) in one surname if you assume what you thought I do - that is, that the realization of the total number of offspring is always above 2N. In that case, the numbers of sons of each man are highly correlated (and so are not independent).; 7:01 AM, May 20, 2014
Tom Grey said...: "Art de Vany thinks diet drinks fool your body into thinking you are eating sweets, and are therefore less effective than a simple calorie count would suggest."

I'm pretty sure that as science gets better, it will find out this is somewhat true. Remember the idea of sweet tasting /no calories is to fool the body -- without considering that the body may have some natural "more fat storage" defense to ... being fooled again.

A similar conclusion might also come from gut bacteria that is "fooled" into becoming far more efficient at producing fat to compensate for being fooled by low calorie sweetness.

I advise replacing sweet soda / cola with real 100% juice, even tho it costs more; plus glasses of water.; 6:00 AM, May 27, 2014
Eric Rasmusen said...: The extinction thread is interesting. I don't really understand it. Is this the idea? We start with 100 families with 100 names. Each one has exactly 4 children, with a random distribution of boys and girls. This means that over time, the surviving surnames are held by more and more people, so the chances of a generation with all girls gets smaller and smaller. Maybe 4 is not the number, but there is some number such that there is positive probability that we won't end up with just one surname.

This sounds like gambler's ruin. In the 4-child model I just described, do we escape probability-one human extinction (that is, as T increases, does the probability that all human babies are girls fall to 0 fast enough?); 2:30 PM, May 27, 2014
Eric Rasmusen said...: Getting back to Korean names: My wife Helen's maiden name was Choi. It was an unusual Choi, though, of Chinese origin. "Choi" is unique in roman or Korean alphabets, but not in the Chinese characters people would use more formally. Maybe the same is true of Kim.; 2:32 PM, May 27, 2014
Tibor said...: Eric: By T you mean the population size? So you'd like to know when there is a probability 1 that you stay with 4 names forever? Well, that probability is never 1, at least as long as there is a positive chance for a generation of a name to be consisting of just girls or just boys who never have children or change their name or something like that. The probability of survival of a name is 1 - sum of probabilities of extinction in each generation. And that sum is zero if and only if all of those probabilities are zero. It is then positive if those probabilities go to zero fast enough, as you said, so that the sum is less than 1.

What could complicate the scenario are the entanglements with the other names. A male member of the population cannot just have a child by himself and the number of children of a female member is limited (not in my scenario above which makes things much simpler but also much less realistic). So the chances of reproduction of an individual do not depend just on the previous state of the process (i.e. how many other men and women share his name) but also on the "competing" names. So the whole process (X1,X2,...,X100) which keeps track of all 100 name populations is a Markov chain, but the individual processes are only Markov with respect to the distribution conditional on all the other name populations, not on its own.
And because of this, the actual properties in such a case are much more complicated (which means I don't see the answer immediately and it could possibly be quite non-trivial).

You're right that if we (over?)simplify like I did above, then the survival of a name population can be described simply as a gambler's ruin. And you don't even need the condition on 4 children per individual for that. However, even if you increase the minimum number of children arbitrarily (at least as long as you keep it finite :D ), you still have a positive chance in a generation for all newborns to be girls and so the probability of nonextinction is never really 1. And if the expected number of sons is 1 or below, the extinction is certain eventually.; 2:51 AM, May 29, 2014
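Eric's 4-child example can be worked through numerically in the simplified model where other surnames are ignored. The sketch below assumes that each child is independently a boy with probability 1/2 and that every son goes on to have exactly four children. It is an illustration of the "1 minus the sum of per-generation extinction probabilities" statement, not a calculation from the thread, and the function names are hypothetical.

```python
def sons_pgf(s, children=4, p_boy=0.5):
    """Probability generating function of the number of sons when each of
    `children` kids is independently a boy with probability `p_boy`."""
    return (1 - p_boy + p_boy * s) ** children


def extinct_by_generation(generations):
    """P(a line founded by one man has died out by generation n),
    for n = 1..generations, obtained by iterating the generating function."""
    q = 0.0
    cumulative = []
    for _ in range(generations):
        q = sons_pgf(q)
        cumulative.append(q)
    return cumulative


cumulative = extinct_by_generation(50)
# Probability that the line dies out exactly in generation n
first_time = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
total = sum(first_time)

print(f"dies out in generation 1 (all four children are girls): {first_time[0]:.4f}")
print(f"dies out within 50 generations: {total:.4f}")
print(f"approximate survival probability: {1 - total:.4f}")
```

The per-generation terms shrink quickly, so the sum settles at roughly 0.087. Under these assumptions a single 4-child line therefore survives with probability of about 0.91, which is less than 1 because the first generation alone is all girls with probability 1/16.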
JWO said...: (income)/(Food costs) is a smaller number.; 2:53 PM, June 16, 2014