Codelog

foreach(Snippet aSnippet in CodeLog){ aSnippet.GetSolution(); }


Singular form of a word in Python


In my last post I talked about using list comprehensions and promised to post a simple method to find the singular form of a word. When I wanted to write such a function, I initially thought of using a chain of if-elif-else constructs, but I didn’t like it that way. So I used lambda functions and a list comprehension to do the same job.

def singularize(word):
    """Return the singular form of a word
 
    >>> singularize('rabbits')
    'rabbit'
    >>> singularize('potatoes')
    'potato'
    >>> singularize('leaves')
    'leaf'
    >>> singularize('knives')
    'knife'
    >>> singularize('spies')
    'spy'
    """
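    # Each rule returns the singular form if its suffix matches, else False.
    # Order matters: more specific endings come before more general ones.
    sing_rules = [lambda w: w[-3:] == 'ies' and w[:-3] + 'y',
                  lambda w: w[-4:] == 'ives' and w[:-4] + 'ife',
                  lambda w: w[-3:] == 'ves' and w[:-3] + 'f',
                  lambda w: w[-2:] == 'es' and w[:-2],
                  lambda w: w[-1:] == 's' and w[:-1],
                  lambda w: w,  # fallback: return the word unchanged
                  ]
    word = word.strip()
    # Keep every result that is not False and take the first one.
    singleword = [f(word) for f in sing_rules if f(word) is not False][0]
    return singleword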
 
def _test():
    import doctest
    doctest.testmod()
 
if __name__ == '__main__':
    _test()

This method is simple if you know the rules of plurals in the English language. I have converted each rule into a lambda function that returns the corresponding singular word if the word ending matches, and False otherwise. The order of the rules is important: for example, I should check for -es before checking for -s. I haven’t taken care of special plurals like men-man, people-person, children-child.
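To see why the order matters, here is a minimal sketch that swaps the -es and -s rules. The helper first_match, the (suffix, replacement) tuples, and the IRREGULAR table are hypothetical names made up for this illustration; they are not part of singularize above. The second half shows one possible way to handle the special plurals, assuming a plain dictionary lookup before the suffix rules.

```python
# Hypothetical helper (not from the post): apply the first suffix rule that matches.
def first_match(word, rules):
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[:-len(suffix)] + replacement
    return word  # no rule matched, so leave the word unchanged

es_first = [('es', ''), ('s', '')]
s_first = [('s', ''), ('es', '')]

print(first_match('potatoes', es_first))  # potato
print(first_match('potatoes', s_first))   # potatoe (the -s rule fired too early)

# Assumed lookup table, limited to the three irregular pairs named above.
IRREGULAR = {'men': 'man', 'people': 'person', 'children': 'child'}

def singular_with_exceptions(word, rules=es_first):
    word = word.strip()
    # Check the exceptions first, then fall back to the suffix rules.
    return IRREGULAR.get(word, first_match(word, rules))

print(singular_with_exceptions('children'))  # child
```

A real table of irregular plurals would be much longer; this one only covers the pairs mentioned in the post.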

I then use a list comprehension to apply each function to the input word, keep the results that are not False, and pick the 0th element. Now you get the singular form of the word.
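The trick relies on how Python's and operator works: it returns its left side if that is false, and its right side otherwise. The sketch below copies the rules from singularize and prints the intermediate list for one word, so you can see why the 0th element is the right one. The names rule, candidates and first are only used for this illustration.

```python
# A single rule: False when the ending does not match, the new word when it does.
rule = lambda w: w[-3:] == 'ies' and w[:-3] + 'y'
print(rule('spies'))    # spy
print(rule('rabbits'))  # False

# Same rules, in the same order, as in singularize().
sing_rules = [lambda w: w[-3:] == 'ies' and w[:-3] + 'y',
              lambda w: w[-4:] == 'ives' and w[:-4] + 'ife',
              lambda w: w[-3:] == 'ves' and w[:-3] + 'f',
              lambda w: w[-2:] == 'es' and w[:-2],
              lambda w: w[-1:] == 's' and w[:-1],
              lambda w: w,
              ]

word = 'spies'
# Every rule that matched contributes a candidate; the most specific comes first.
candidates = [f(word) for f in sing_rules if f(word) is not False]
print(candidates)     # ['spy', 'spi', 'spie', 'spies']
print(candidates[0])  # spy

# An alternative sketch (not what the post uses): stop at the first match
# and call each rule only once.
first = next(r for r in (f(word) for f in sing_rules) if r is not False)
print(first)          # spy
```

The list version calls each matching rule twice and builds candidates that are thrown away, which is fine for a handful of rules.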

If you look at the function, you will notice that I have used doctests, which are a very easy way to test your functions. You can download the source code and run it to execute the doctests. Wow, now I’ve got another topic to write about in the next post: doctests in Python.
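As a quick taste of what a doctest does, here is a small sketch. The function double is a made-up example, and instead of doctest.testmod() it drives the parser and runner from the standard doctest module directly, so the pass and fail counts can be read from code.

```python
import doctest

def double(n):
    """Return n multiplied by two.

    >>> double(2)
    4
    >>> double('ab')
    'abab'
    """
    return n * 2

# Pull the >>> examples out of the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {'double': double}, 'double', None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results.failed)     # 0
print(results.attempted)  # 2
```

In everyday use, the doctest.testmod() call in _test above does the same work for every docstring in the module.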

Written by cnu

July 27th, 2008 at 1:48 pm

Posted in Python
