Danbooru

Tag Alias: shinryaku!_ikamusume -> shinryaku!_ika_musume

Posted under General

On the other hand, nonce compound words formed by following an established word's pattern make more sense than ones that use a new pattern. If I say squidgirl in English, you know what I mean, and it's not like it's ungrammatical.

Sure, you're right. This is not really an objective thing. But generally noun compounds are not jammed together into a word until the compound takes on a separate meaning of its own, in some sense. With "nekomusume", we have various mythological connotations which don't exist in this case.

Of course, English is flexible when it comes to such things, but Japanese doesn't even have spacing in the first place. So what are we supposed to do when we romanize Japanese?

Well, I kind of hate to even mention this, but ANN apparently seems to know.

0xCCBA696 said:
Well, I kind of hate to even mention this, but ANN apparently seems to know.

Repeat after me: ANN is not an authority on romanisation. ANN is not an authority on romanisation. ANN is not an authority on romanisation.

And ikamusume is the same as a hyphen in squid-girl would be. In German you'd spell it Kalmarmädchen, as a single word. It's not two separate concepts of ika and musume. It's a single concept derived from both, and both Japanese and Germanic languages express this kind of derivation by simple juxtaposition. Thus it's a single word, not two.

Yo Hazuki. This has nothing to do with romanization. ;) It also has nothing to do with German (wtf?). Orthographical words and lexical words are different concepts, by the by. Not sure if you're aware of that. In English we tend to try to model the former on the latter, and in other languages such as German, they don't. (EDIT: well, not really that much; but English tends to err on the side of writing single lexemes as multiple orthographic words, whereas German often writes multiple lexemes as a single orthographic word.) Also please don't talk about "concepts" when all we're discussing is how to write something. It really doesn't matter what it means - what matters is whether it's one word or a phrase.

Soljashy said:
stuff about ANN

Yeah, which is why I said I hated to mention it. :/ Just to be clear, I do have a fairly good reason for my proposition, which has nothing to do with the idiots at ANN who write reviews about 'Space and Wolf' and how its English voice actors 'annunciate' their lines.


0xCCBA696 said:
But generally noun compounds are not jammed together into a word until the compound takes on a separate meaning of its own, in some sense.

Could you provide justification for that assertion? English is pretty much a hodgepodge when it comes to compound words: we have "ice cream", "ice-cold", and "icepick".

The only hard and fast rule I know of for compounds in English (and many other languages) is that the word's part of speech and basic meaning are determined by the final morpheme (a "cowboy" is a type of "boy").

Orthographically speaking, though, English is all over the place, and I know of no general rules for determining the proper orthographic convention of a particular word. The largest number of compounds, however, have no space between the morphemes.

Also if you look at similar conventions for "____girl" or "____man" in English, we have Superman, Batgirl, fireman, schoolgirl etc. I don't see why "squidgirl" would be a sensible exception, or why mythology would have any bearing on the word's orthography.


ice cream and icepick are both lexical words, since their meanings are not determined by their constituent parts alone, and furthermore because they are the most "basic" form in their lexeme (i.e. "ice creamy", "icepick-like" are not basic words in the lexical sense). 猫娘 is similarly a lexical word in Japanese, because its meaning is not determined solely by its constituent parts, 猫 and 娘. Originally, of course, it was, until the compound 猫娘 took on a semantic life of its own and developed further connotations. イカ娘 however seems to be a compound to be taken at face value - a girl who is also a squid. A squid girl.

Of course, an icepick is just a pick used on ice, but it does have some specific properties other than that which set it apart. Merriam-Webster defines it as "a hand tool ending in a spike for chipping ice", for example.

Generally, a vague definition for what a lexical word is would be whether or not there is a separate entry for it in an average dictionary. As I said, it's not really an objective thing, but the Japanese Wikipedia does have an article on 猫娘 and not one on イカ娘 (other than a redirect to an article about this very manga).

And sorry, what I said above about English is actually very wrong (as your examples prove) - we don't tend to give much more of a shit than German speakers about writing single lexical words as single orthographic words. Not sure what I was thinking there. Edited for correctness.


I'm still not following you as to how "catgirl" and "squidgirl" are different in such a way that they should be treated differently orthographically.

You say that "catgirl" is a lexical word because it might appear in a dictionary, and therefore is privileged compared to other compound words. I would strongly argue that such an argument amounts to prescriptive linguistics. Using Wikipedia as that end-all-be-all dictionary is also somewhat troubling.

So long as a word has a clear semantic meaning (as the Wikipedia article you pointed to states "...can be generally understood to convey a single meaning..."), that word is a lexical item regardless of whether it is defined in any dictionary or not. This is true even of nonce words.

Furthermore, I'd argue that virtually all compound words are lexical items, since they are almost always created to provide a name for a specific semantic concept that previously hadn't had a name. There is more meaning to "squidgirl" than "a girl who is also a squid"; it also implies a mostly humanoid anthropomorphic blend of squid and female human. If the word didn't have these hidden meanings, it could easily be interpreted as simply a female squid.

Finally, I don't see how being a lexical item (as opposed to what? "squid girl" isn't a grammatical noun phrase if it's simply two disparate consecutive nouns) has any bearing on orthographic convention. As I noted and you agreed, English simply doesn't have conventions when it comes to the separation, hyphenation, or concatenation of compound word morphemes. Probabilistically speaking, concatenation is the most common choice.


I'll address your last point first. English makes no attempt to coordinate orthographic word boundaries with lexical word boundaries, nor does any other language that visually separates words orthographically with spaces and the like -- even if they did, the spoken and written languages would quickly diverge anyway (as is the case with "ice cream", for example). That doesn't mean we shouldn't try to have some sort of a sensible basis for how we write things.

Since romanization of Japanese (and the associated typographical conversions such as adding of spaces, etc.) is a very shallow transference, it would be meet to conduct the process in such a way that the rules of conversion are as simple and direct as possible. Incidentally, that's why I prefer 日本式 romanization, since I think phonology is more basic than phonetics (see forum #31328). In the case of spacing, this maxim would lead to us romanizing Japanese without any spacing at all, but that's just too hard to read, so we have to break it up somehow. The easiest way, IMO, is to go by dictionary, since that's something prescriptively determinable. Any attested word can generally be decomposed into these prescribed "lexical words" in some fashion.
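To make the "go by dictionary" idea concrete, here is a minimal sketch, not a real tool. It assumes a tiny hand-made lexicon (the `LEXICON` set below is hypothetical, as are the names `segment` and `romanize_spacing`) and uses greedy longest-match, which is only one of several ways to do the decomposition. Whether "ikamusume" comes out as one piece or two depends entirely on whether the lexicon lists it, which is exactly the point under dispute.

```python
def segment(text, lexicon):
    """Greedy longest-match split of `text` into entries of `lexicon`."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then fall back to shorter ones.
        for j in range(len(text), i, -1):
            if text[i:j] in lexicon:
                words.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the lexicon matches here: emit one character as-is.
            words.append(text[i])
            i += 1
    return words


# Hypothetical lexicon: "nekomusume" is listed as a word, "ikamusume" is not.
LEXICON = {"ika", "musume", "neko", "nekomusume", "shinryaku"}


def romanize_spacing(text, lexicon=LEXICON):
    """Join the dictionary pieces with underscores, tag-style."""
    return "_".join(segment(text, lexicon))
```

With this lexicon, `romanize_spacing("ikamusume")` splits into two pieces while `romanize_spacing("nekomusume")` stays whole; add "ikamusume" to the set and the first result changes too.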

In response to your parenthetical: no, a juxtaposition of two nouns certainly can be a perfectly grammatical noun phrase. For example, "squid tattoo" is obviously not something people would consider an established single concept. It's simply a tattoo of a squid. Alternatively it could be interpreted as a tattoo on a squid. There's no single set meaning for this compound.

Shinjidude said:
There is more meaning to "squidgirl" than "a girl who is also a squid", it also implies a mostly humanoid anthropomorphic blend of squid and female human.

Fair enough, and I can kind of see that there is a trend (at least on Danbooru and in similar circles) to treat "Xgirl" compounds in such a way. Now, if this is the case for "X娘" in Japanese, then I'd say it should be written as "ikamusume". I'm not convinced of that, though. Google isn't helping, just returning results related to this particular character in this particular manga.

Actually, come to think of it, we also have mecha musume. As a precedent, it argues for ika musume, but it also provides another structure following the "X娘" pattern, and does indeed depict anthropomorphized mecha. So your argument seems to have some merit.

Miscellaneous counterpoint postscript (oh look, a compound noun phrase! :P): 1) Lexical items exist in the lexicon, and dictionaries aim to catalogue the lexicon. A lexical item may not actually appear in a dictionary, and indeed the question of whether it would, hypothetically, be included in a sufficiently comprehensive dictionary is not a question that can be answered objectively. 2) Prescriptive linguistics has its place, so please don't treat it as some sort of taboo. 3) Wikipedia is probably one of the most comprehensive quasi-lexicons we have at this point, at least when cataloguing noun phrases.

I'll concede on the "N N -> NP" thing, I guess that does occur more regularly than I had been considering.

While prescriptive linguistics can have its place (to be honest we use it all the time when setting our guidelines), I don't think it in itself is sufficient to call things ungrammatical, or in this case not even words. Elsewise things like "To boldly go..." or "...to put up with." would be strictly incorrect despite being used and understood perfectly by fluent speakers.

While Wikipedia is a somewhat comprehensive encyclopedia of human knowledge, it's not strictly speaking a comprehensive lexicon. Many concepts are not seen fit for inclusion, most of all slang terms or terms with very limited scope. Zokugo-dict or Urban Dictionary can help expand on it, but even then nonce words would be excluded. These sources can certainly provide positive evidence of a word or term's use, but the inverse is not true. Just because something is not on Wikipedia doesn't mean it is not a word.

Back to the point, mecha_musume does provide a counter-example to my argument. However, the number of lexemes and the number of instances of use still indicate that the spaced pattern is by far a minority case.


How many other "X娘" lexemes are there, and how many of them have been romanized as "Xmusume" as opposed to "X musume"? Anyway, I actually saw mecha_musume, the concept, as being supportive of your argument that "X娘" constructs form single lexemes, though our spelling of it as a tag disagrees with that.

And yes, I don't claim that something having a Wikipedia article is anywhere close to the same as it being a lexeme. However, the criteria イカ娘 would need to meet in order to be a lexeme are similar enough to those that qualify 猫娘, an almost identical compound (animal + 娘), as a lexeme. So I reasoned that in this particular case, the fact that Wikipedia has an article on 猫娘 but not on イカ娘 might very well be somewhat significant evidence.

Prescriptive linguistics isn't necessarily ivory-tower nonsense. Both of your examples of prescriptive bullshit were devised under much less-than-ideal circumstances: split infinitives were advised against by some at a time when the upper class happened to use them less frequently than the lower class in Britain (languages naturally diverge into registers or even dialects based on the most trivial of possible usage choices), and the suggestion somehow caught on like wildfire; as for preposition stranding, that proscription was just some idiotic posturing by John Dryden.

When I say (herein) that something is "not a word", I only mean that it is more than a word, i.e. is a phrase. I don't mean to dub it invalid in any way.

Hmm, how many "__娘" lexemes are there, and how do English speakers typically romanize them? That's actually a harder question to answer than what I had been considering with English's "__girl". We usually translate "娘" to "girl" when creating tags except in the case of names, which means we don't have too many examples.

Pixiv's version of our wiki system yields about 439 tags using 娘; some of them are certainly simply phrases and not lexical items of the kind we had been arguing over. Unfortunately that's sort of useless to us, since knowing the number of attested examples doesn't help us know how we would have romanized them. As for other groups that regularly romanize Japanese and strive to do it consistently, I can't think of any prominent sources to look at and search easily other than Danbooru.

This sort of leads us back to square one.

Also, despite my intuition and uses in general English, it seems on Danbooru we have been preferring spaces for *girl. We've even gone so far as to use school_girl when schoolgirl is a perfectly good word (EDIT: this might be due to a schoolgirl -> school_uniform alias, we might want to look at this). Even here though, this doesn't set a precedent for spacing in romanized words.

I still have to argue against ikamusume/ika_musume not being a compound word or single lexical item, and I aesthetically like ikamusume better. Pretty much all my other arguments seem sort of moot though due to lack of evidence. That leaves me little more than "+1 ikamusume" without strong argument for or against either orthography.

While this is all interesting reading, what I'm taking away from it is that there's really no well-understood, agreed-upon rule governing this. So it more or less boils down to personal preference, especially when we're dealing with concepts from a language that generally doesn't use spaces and can't offer much guidance on the matter.

My own preference here is no space.
