Singular form of a word in Python

In my last post I talked about using list comprehensions and promised to post a simple method to find the singular form of a word. When I set out to write such a function, I initially thought of using a chain of if-elif-else constructs, but I didn’t like it that way. So I used lambda functions and a list comprehension to do the same job.

def singularize(word):
    """Return the singular form of a word
 
    >>> singularize('rabbits')
    'rabbit'
    >>> singularize('potatoes')
    'potato'
    >>> singularize('leaves')
    'leaf'
    >>> singularize('knives')
    'knife'
    >>> singularize('spies')
    'spy'
    """
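    # Each rule returns the singular form if the ending matches, else False.
    # Rules are tried in order, so the more specific endings come first.
    sing_rules = [lambda w: w[-3:] == 'ies' and w[:-3] + 'y',
                  lambda w: w[-4:] == 'ives' and w[:-4] + 'ife',
                  lambda w: w[-3:] == 'ves' and w[:-3] + 'f',
                  lambda w: w[-2:] == 'es' and w[:-2],
                  lambda w: w[-1:] == 's' and w[:-1],
                  lambda w: w,  # fallback: return the word unchanged
                  ]
    word = word.strip()
    # Keep the results of the rules that matched and take the first one.
    singleword = [f(word) for f in sing_rules if f(word) is not False][0]
    return singleword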
 
def _test():
    import doctest
    doctest.testmod()
 
if __name__ == '__main__':
    _test()

This method is simple if you know the rules for forming plurals in English. I have converted each rule into a lambda function that returns the corresponding singular word when the word ending matches, and False otherwise. The order of the rules is important: for example, I must check for -es before checking for -s. I haven’t taken care of irregular plurals like men-man, people-person, children-child.
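
For the irregular plurals, one option is to check a lookup table before trying any suffix rule. The following is only a minimal sketch under my own assumptions: the names IRREGULAR, SUFFIX_RULES and singularize_with_irregulars are hypothetical, the table holds just the three examples mentioned above, and the rules are written as plain (plural ending, singular ending) pairs instead of lambdas to keep it short.

```python
# Hypothetical lookup table for irregular plurals; extend as needed.
IRREGULAR = {
    'men': 'man',
    'people': 'person',
    'children': 'child',
}

# (plural ending, singular ending) pairs, most specific ending first.
SUFFIX_RULES = [
    ('ies', 'y'),
    ('ives', 'ife'),
    ('ves', 'f'),
    ('es', ''),
    ('s', ''),
]

def singularize_with_irregulars(word):
    """Check the irregular table first, then fall back to the suffix rules."""
    word = word.strip()
    if word in IRREGULAR:
        return IRREGULAR[word]
    for plural, singular in SUFFIX_RULES:
        if word.endswith(plural):
            # Cut off the plural ending and attach the singular one.
            return word[:len(word) - len(plural)] + singular
    return word  # no rule matched: return the word unchanged

print(singularize_with_irregulars('children'))  # child
print(singularize_with_irregulars('potatoes'))  # potato
```

Because the table is consulted first, an irregular word never reaches the suffix rules, so the rule order only matters among the regular endings.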

I then use a list comprehension to apply each function to the input word, keep the results that are not False, and pick the 0th element, which comes from the first matching rule. That gives you the singular form of the word.
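
One thing to note is that the list comprehension calls every rule twice and runs all of them even after one has matched. A possible alternative, sketched here under the same convention (a rule returns the singular form or False), is a generator combined with next(), which stops at the first match. The helper name first_match and the shortened rule list are my own, for illustration only.

```python
def first_match(word, rules):
    """Return the result of the first rule that does not return False."""
    # Generator expression: each rule is called once, and only when needed.
    results = (rule(word) for rule in rules)
    return next(r for r in results if r is not False)

# Shortened rule list, most specific ending first.
rules = [
    lambda w: w[-3:] == 'ies' and w[:-3] + 'y',
    lambda w: w[-2:] == 'es' and w[:-2],
    lambda w: w[-1:] == 's' and w[:-1],
    lambda w: w,  # fallback: always matches, so next() never runs dry
]

print(first_match('spies', rules))    # spy
print(first_match('rabbits', rules))  # rabbit
```

The final catch-all rule matters here: without it, next() would raise StopIteration for a word that matches no rule.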

If you look at the function, you will notice I have used doctests, which are a very easy way to test your functions. You can download the source code and run it to run the doctests. Wow, now I’ve got another topic to write about in the next post: doctests in Python.
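
To see what a doctest run actually checks, it can help to run the examples of a single function in verbose mode. This is just a small sketch: strip_plural_s is a hypothetical helper made up for the illustration, not part of the code above.

```python
import doctest

def strip_plural_s(word):
    """Drop a trailing 's' (hypothetical helper, for illustration only).

    >>> strip_plural_s('rabbits')
    'rabbit'
    >>> strip_plural_s('sheep')
    'sheep'
    """
    return word[:-1] if word.endswith('s') else word

# Run only this function's docstring examples and report each one.
# The dict supplies the names the examples are allowed to use.
doctest.run_docstring_examples(
    strip_plural_s, {'strip_plural_s': strip_plural_s},
    verbose=True, name='strip_plural_s')
```

For a whole file, doctest.testmod(verbose=True) or running `python -m doctest -v` on the file should give a similar per-example report.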

5 comments

  1. That was really great. Python really rocks. I wonder what the output would be for teeth, feet, etc. I think you should also add something like

    lambda w: w[1:4] == 'eet' and w[:1] + 'oot' + w[4:],

    Something like that, I guess.

  2. I wouldn’t suggest relying on fixed character positions such as 1:4, as I have seen many places where people don’t care about grammar and end up with words like “ugly-teeth” or “duck-feet”, and the rule would fail there. For such irregular plurals, it’s better to have a dictionary.
    Anyway, that was a nice example where you can just append the conditions you want to check (even at runtime), which would be very difficult with an if-elif chain.
    If you want to remove all the extra cruft from a word and get only the root word, I would suggest using a proper stemmer. There are many good ones: Porter Stemmer, Lovins Stemmer, etc. Stemming is a standard procedure widely used in NLP.

  3. nice job.
    A helpful tutorial for beginners.

  4. Nice one…

    Thanks

    Anoop