Re-search
(a page for the light-hearted and forgiving)
The Language bit
I am mainly interested in language
as a faculty for expression of compositional meanings, i.e. grammar, in contrast to, say,
lexical meanings and how they are decomposed or derived.
The interesting bit in linguistic theorising
is that human languages exhibit limited semantic dependencies in their syntax.
We would like to know why.
The strong hypothesis in this respect is that languages differ only
in their lexicons. Common dependencies need not be stipulated in grammars, only the language specific ones.
From this perspective, infinity of languages (therefore recursion) is of secondary interest.
I am in favour of testing the strong hypothesis
before we entertain the weaker ones.
My high school science teacher told me to do that.
I usually don't do what I'm told but I make exceptions.
Lexicalising a grammar is crucial for this hypothesis.
The underlying idea in fully lexicalising a grammar is the notion
of "possible lexical category," as models of what
Edmund Husserl called
"sensibly distinct representations in the mind."
Many of us (radical lexicalists)
believe categories need explanation, rather than stipulation.
NB. These kinds of categories are knife edges: one side is syntactic, the other semantic. Any lexicalised grammar needs to do justice to both, unless we start
believing in one-edged knives. (The so-called one-edged knives, such as the kard, culter and facon, are knives
with one edge sharpened, since you asked.)
I try to work towards a theory of the lexicon--whatever that is.
Something 'idiosyncratic' does not need a theory. Naturally, I don't
share such an impoverished and incidental view of the lexicon.
Rather, language is the infinite closure of the lexicon with
respect to an invariant (and finite) combinatory system.
(Having said that, I do believe language is a kludge; if I wanted
perfection in nature, I'd study sharks. The lexicon is language with a small l,
and its combinatorial theory is language with a big L.)
I guess what I'm saying is that a theory of kludge is a kludge too; it's
turtles all the way down.
We can conceive the lexicon as something shaped by the invariant.
The invariant can be studied semiotically (extensionally) and
psychologically (intensionally). The same goes for the lexicon.
But first we must account for Merrill Garrett's
insight that "parsing is a reflex" (try turning it off
if you're a skeptic). Part and parcel of the strong
hypothesis is that this is due to having a combinatory system that
is purely computational and oblivious to world matters. No movement, no ghost in the machine,
no checking, no caching, no tampering, no tinkering. What you do with that computation in real life is, ehm, the real meaning of life.
(What about the mind, you say. I don't know.
These hefty global questions usually emanate from certain parts of
the American East Coast. Ask them.)
More specifically,
I am interested in how combinatory and substantive constraints shape surface syntax,
and the lexical reflex of that effect. I am also
interested in
interactions among the components--functionally speaking--of a language system:
morphology, syntax, semantics, prosody, information structure, what have you.
Recently, I have been studying grammatical relations, word order, directionality and categorisation
in the lexicon, intonation in grammar,
Bayesian sorcery for choosing categories, and morphosyntax,
based on a theory of syntax-semantics called
Combinatory Categorial Grammar
(CCG).
On the applied side, I am interested in
parsers for categorial grammars,
and modeling multidomain
interactions, such as generation of contextually appropriate discourse
entities, and syntax-morphology-phonology
trilogy in parsing.
Some public tools from applied research are
available below, at our lab,
and at Edinburgh-born
openCCG.
(some relevant papers)
The Cogsci bit
The major difference between trees and animals is that animals move and trees don't. Everything that moves has a nervous system. It seems that the whole need for the nervous system arose because things that move must coordinate their movement and actions. (If you are in doubt, try tying your shoelaces as you run).
Or it could just be a serendipitous accident to give us the mother of all neurons, in which case I will close shop and worship Taranis.
The point of cognitive science is to make sense of how coordinated activity can take place with what little perceptive abilities a species has. That's what David Hume suggested (well, not in these words; I'm a bit old-fashioned in this matter, and prefer to leave the good words to their owners).
When things move, they must track other objects and coordinate their actions. (An inquiring mind might estimate the potential lifetime of a mouse that seems to totally ignore a curious or hungry cat.)
A simple hypothesis, aka the computationalist hypothesis, is that all kinds of coordinated action are more of the same stuff. What distinguishes the species is their resource endowment and life training (i.e. exposure to data).
So maybe, just maybe, the most uniquely human cognitive trait, language, is more of the same stuff, with more resources and less training, rather than a gift to mankind or some kind of miracle. (read: the only miracle I believe in is the national lottery.) A bit of evolutionary patience might give us wonders, if you pardon the expression.
I blame Beckett for legitimising schoolboy humour in public places.
Some papers
- Bozsahin, Cem (DRAFT 1.0). Grammars, Programs and the Chinese Room.
(pdf, for comments).
(longer version to appear in 2006 International European Conference on Computing and Philosophy; ECAP)
- Bozsahin, Cem (DRAFT v2.0).
Word Order, Word Order Flexibility and the Lexicon
(was 'Lexical Origins of Word Order and Word Order Flexibility.')
In preparation for a chapter in Theoretical Issues in Word Order,
S. Ozsoy (ed.), Kluwer. For comments.
(pdf)
- Bozsahin, Cem (DRAFT 1.0).
Directionality and the Lexicon: Evidence From Gapping.
For comments.
(.ps | .pdf).
- Ozge, Umut and Cem Bozsahin (2010). Intonation in the grammar of Turkish.
Lingua 120:132-175. pdf
- Zeyrek, Deniz, Umit Turan and Cem Bozsahin (2008). The role of annotation
in understanding discourse. ICTL 2008 Proceedings.
(pdf)
- Coltekin, Cagri and Bozsahin, Cem (2007). Syllables, Morphemes and
Bayesian Computational Models of Acquiring a Word Grammar.
Proc. of 29th Annual Meeting of Cognitive Science Society, Nashville.
(pdf)
- Bozsahin, Cem, and Asli Goksel (2007). Turkce'de Ezgi: Sozdizim ve Edimle Iliskisi [Turkish Intonation: Its Relations to Syntax and Pragmatics]. 21. Dilbilim Kurultayi, Mersin. (doc)
- Bozsahin, Cem (2004).
On the Turkish Controllee.
To appear in ICTL 2004 Proceedings.
(pdf). For comments.
- Tutar, Sercan, Cem Bozsahin, and Halit Oguztuzun (2003).
TPD: An Educational Programming Language Based on Turkish Syntax.
The First Balkan Conference in Informatics, November, Thessaloniki.
(pdf)
- Bozsahin, Cem (2002).
The Combinatory Morphemic Lexicon.
Computational Linguistics, 28(2):145-186.
(pdf)
- Yuksel, Ozgur, and Cem Bozsahin (2002).
Contextually Appropriate Reference Generation.
Natural Language Engineering, 8(1):69-89.
(pdf | ps)
- Bozsahin, Cem (2000).
Gapping and Word Order In Turkish.
Proc. of 10th Int. Conf. on Turkish Linguistics, Istanbul, August.
(ps)
- Bozsahin, Cem, and Deniz Zeyrek (2000).
Dilbilgisi, bilisim ve bilissel bilim [Grammar, Computation and Cognitive
Science]. Dilbilim Arastirmalari 2000 [Research in Linguistics,
vol.11].
(ps)
- Sehitoglu, Onur, and Cem Bozsahin (1999).
Lexical Rules and Lexical Organization. In Breadth and
Depth of Semantic Lexicons,
Evelyn Viegas (ed.), Kluwer.
(ps)
- Bozsahin, Cem (1998).
Deriving the Predicate-Argument Structure for a Free Word Order
Language. Proceedings of COLING-ACL'98, pp. 167-173, Montreal.
(ps)
- Bozsahin, Cem (1997).
Combinatory Logic and Natural Language Parsing. Elektrik,
Turkish J. of EE and CS, 5(3), 347-357.
(ps)
- Bozsahin, Cem (1996).
Ulamsal dilbilgisi ve Turkce [Categorial Grammar and Turkish].
Dilbilim Arastirmalari 1996 [Research in Linguistics] 7:230-244.
(ps)
- Bozsahin, Cem and Elvan Gocmen (1995).
A Categorial Framework for Composition in Multiple Linguistic Domains.
Proc. of the 4th Int Conf on Cognitive Science of NLP, Dublin
(CSNLP'95).
(ps)
- Oflazer, Kemal, and Cem Bozsahin (1994).
Turkce Dogal Dil Isleme [Turkish NLP].
Proc. of Turkish Informatics Society TBD'94.
(ps)
- Bozsahin, Cem, and Nicholas V. Findler (1992).
Memory-based Hypothesis Formation.
Cognitive Science, 16(4):431-454. (3 figures missing).
(ps)
Some public-domain tools
Open CCG
- This is public software for developing CCG grammars and lexicons.
- It has several English systems, and small demo systems for
Dyirbal, Basque, Tagalog, Turkish and Inuit.
- The morphosyntactic CCG below is re-implemented in openCCG by Gunes
Erkan.
Morphosyntactic CCG (v. 0.4)
- Software: tested with SICStus Prolog on Solaris;
but likely to work on others as well.
- Features: CKY parse engine; application and harmonic composition rules;
Type raising in grammar; S, N, NP basic cats; / and \ slashes (no |);
Morphemic lexicon; Morphosyntactic and attachment modalities;
A tiny English lexicon and a Turkish lexicon
(inflections, plus a sampler of nouns, verbs, adj etc.)
- to unfold the tar file: gunzip msccg0.4.tar; tar xvf msccg0.4.tar
- It is provided solely for research purposes.
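For the curious, here is a minimal sketch of what a CKY engine with application and harmonic composition amounts to. It is written in Python for illustration and is not the msccg code (which is in Prolog). The names `Slash`, `combine`, `cky` and `TOY_LEXICON` are hypothetical, the lexicon is a three-word toy, and type raising, the morphemic lexicon and the modalities are all left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Set, Union


@dataclass(frozen=True)
class Slash:
    """A complex category. "/" seeks its argument to the right, "\\" to the left."""
    result: "Cat"
    slash: str
    arg: "Cat"


Cat = Union[str, Slash]  # basic categories are plain strings: "S", "N", "NP"


def combine(left: Cat, right: Cat) -> List[Cat]:
    """Return every category derivable from two adjacent categories."""
    out: List[Cat] = []
    # Forward application: X/Y  Y  =>  X
    if isinstance(left, Slash) and left.slash == "/" and left.arg == right:
        out.append(left.result)
    # Backward application: Y  X\Y  =>  X
    if isinstance(right, Slash) and right.slash == "\\" and right.arg == left:
        out.append(right.result)
    if isinstance(left, Slash) and isinstance(right, Slash):
        # Forward harmonic composition: X/Y  Y/Z  =>  X/Z
        if left.slash == "/" and right.slash == "/" and left.arg == right.result:
            out.append(Slash(left.result, "/", right.arg))
        # Backward harmonic composition: Y\Z  X\Y  =>  X\Z
        if left.slash == "\\" and right.slash == "\\" and right.arg == left.result:
            out.append(Slash(right.result, "\\", left.arg))
    return out


def cky(words: Sequence[str], lexicon: Dict[str, List[Cat]]) -> Set[Cat]:
    """Fill a CKY chart bottom-up; return the categories spanning all the words."""
    n = len(words)
    if n == 0:
        return set()
    chart: Dict[tuple, Set[Cat]] = {}
    for i, word in enumerate(words):
        # Unknown words simply get no category in this sketch.
        chart[(i, i + 1)] = set(lexicon.get(word, ()))
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            cell: Set[Cat] = set()
            for mid in range(start + 1, end):
                for l_cat in chart[(start, mid)]:
                    for r_cat in chart[(mid, end)]:
                        cell.update(combine(l_cat, r_cat))
            chart[(start, end)] = cell
    return chart[(0, n)]


# A toy lexicon (an assumption for illustration, not the distributed one).
VP = Slash("S", "\\", "NP")
TOY_LEXICON: Dict[str, List[Cat]] = {
    "Mary": ["NP"],
    "John": ["NP"],
    "sleeps": [VP],
    "sees": [Slash(VP, "/", "NP")],
}

if __name__ == "__main__":
    print(cky("Mary sees John".split(), TOY_LEXICON))  # prints {'S'}
```

Everything language-specific sits in the lexicon; `combine` knows nothing about English, which is the point of lexicalising a grammar.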
Some talks
- Schonfinkel'den Dilbilime Anlam ve Dizim (Semantics and Syntax From Schonfinkel to Linguistics). ODTU Felsefe Bol. 25. Yil 'Anlam' Kongresi. 19.12.2008
- Meaning, form and adjacency: Schönfinkel's legacy (Ankara Linguistic Circle, 10.10.2008) (pdf)
- What do we parse when we parse? (Bogazici Univ. Linguistics Colloq.,
3.4.2008) (pdf)
- Computationalism as a philosophy of science in cognitive science.
(METU Philosophy and Cognition Workshop, 8.3.2008) (pdf)
- Turkce'de Ezgi: Sozdizim ve edimle Iliskisi (Turkish Intonation: Its relations to syntax and pragmatics; with Umut Ozge and Asli Goksel). (Mersin XXI.
Dilbilim Kurultayi, 10 Mayis 2007)
- Dil ne degildir? (What language is not) (Abant Izzet Baysal Universitesi,
Psikoloji, 23.3.2007)
- Type-dependence of Language (Ankara Linguistic Circle, March 2007)
- Two notions of category in linguistics: Some (really naive) Algebra
(METU Applied Mathematics Colloq. March 2007)
- Lexical Integrity and Lexical Organisation
(CL and Phonology Colloquium, Saarbrücken, 19.1.2006)
- Language from the lexicon (Cognitive Science Colloquium, METU Ankara, 21.10.2005)
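- Kultur oncesi Dil:
Niye cocuklarin bazi 'yanlislari' baska bir dilde 'dogru' cikiyor,
bazi 'yanlislari' da hic yapmiyorlar? (Language before culture: Why do some of children's 'mistakes' turn out to be 'correct' in another language, while some 'mistakes' they never make at all?) (Gercek Seminerleri, Ankara 8.12.2004)
(announcement | slides)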
- Zihinsel Sozlukte Dilbilgisi (Grammar in the Mental Lexicon) (Hacettepe Dilbilim, Ankara, 5.11.2004)
(ps)
- What's in a Lexicon? (METU CS Colloq., March 2004)
(pdf)
- Control and Grammatical Relations (Paris, October 2003)
(pdf, some material outdated, see Ed'04)
- Lexical Origins of Word Order and Word Order Flexibility
(Edinburgh Linguistic Circle, February 2003/Antwerp Typology Sem. March 2003)
(pdf)
- Inflectional Morphology as Syntax (Edinburgh ICCS/HCRC, October 2002)
(pdf, similar in material to METU CS'04)
- (Yapay) Zeka ve Dil ((Artificial) Intelligence and Language) (ODTU, 9.11.2001) (doc)
Computer Engineering Department
Middle East Technical University
06531 Ankara, Turkey
tel: +90-(312)210-5580
fax: +90-(312)210-1259