Monday, December 1, 2014

Forest and Land Cover Survey Fieldwork


The second component of my fieldwork consisted of two parts.  One was forest surveys, to measure the health and diversity of local forests, and the other was land cover surveys, to see how accurate my land cover classification using satellite imagery was.  To do these surveys, I hired local guides and taught them how to use forest survey equipment like DBH tapes and laser rangefinders for measuring tree heights.

Here is Drissa measuring the boundaries of a plot. Drissa is an expert hunter and knows the area around Kissa like the back of his hand.

This is Omar, using the laser rangefinder to measure the height of a nearby tree.

This is Amadou, measuring the DBH (Diameter at Breast Height) of an Isoberlinia tree. Amadou is an excellent student and is almost conversational in English. He had to quit school when his father passed away and his family was no longer able to pay his tuition fees. He is also the only person in Kissa who could consistently beat me at the Connect-4 set that I gave to the village.

The Landscape - A Forest-Savanna Mosaic

Despite having a relatively uniform rainfall distribution latitudinally, Mali's ecosystems are extremely heterogeneous. This is because there is no "climax community" that would eventually dominate in the absence of disturbance. Rather, regimes of grass and forest compete at a landscape scale, based on disturbances like fire, drought, and human activity.

In the Eastern US, for example, forests will dominate when given enough time.  If you clear a plot of land, first grasses will grow, then shrubs, and eventually trees.  If you do not disturb the area and come back in a hundred years, trees will still be there, because they are the climax community for that ecosystem.  Most ecosystems of the world have a climax community, but at the boundary between arid grasslands and tropical rainforests, there is no climax community, and both grasses and trees can potentially dominate. Thus, if you clear a plot of land in southern Mali, halfway between the grasses of the Sahel and the rainforests of tropical West Africa, in two hundred years you are just as likely to find grasses as you are forest.

 Here are some of the grasslands I'm talking about.  At the end of the rainy season, the grass can be over your head!

 And here I am standing next to a Kapok tree in a heavily forested area near Kissa.

In the two pictures above, there was nothing specific about those particular plots of land that "determined" whether the plant community there would be trees or grasses.  In a hundred years, the land in the first picture could look like the second picture, or vice versa.

Human Impacts on the Landscape

The boundary between forested areas and grasslands shifts based on disturbance. Things like fire, severe drought and flooding can cause a once forested area to become grassland, or can make trees sprout where grasses used to dominate.  However, human actions are probably the largest drivers of land cover change in Mali.  Areas that were once farmed turn to forest, as the tilled soil allows tree roots to grow quickly and they gain an advantage over shallow-rooted grasses. At the same time, people frequently burn grasslands at the end of the dry season, which gives grasses an advantage, as they are more fire-adapted than trees.  In fact, people have been impacting the landscape in southern Mali for so long that it is difficult to find examples of what a "purely" natural disturbance regime would have looked like, although the African megafauna that are no longer found in Mali likely used to play a large role.

Here is a fire set to a grassland. We were just walking down the trail when my friend paused and said "let's burn this". The fire will stop when it gets to the forest boundary, although it may penetrate the forest a little, extending the grassland's range.

 And here is a grassland that has already been burned. Malians burn grasslands because they are much easier to traverse this way and it is easier to spot game.

 This area, on the other hand, used to be farmland about twenty years ago.  Now, saplings are taking over, shading out the grasses.

This area is an old homestead, where people lived maybe fifty years ago. The fields that people farmed in the immediate vicinity of their houses encouraged forests and the fruit trees that they planted in their household shaded out grasses and fires, leading to this magnificently dense forest.

Sunday, November 30, 2014

The Buru, a Malian Trumpet

During Seliba this year, my friend Omar told me that they would be bringing out their "Burus", large wind instruments. I had never seen or heard of one of these before, so I was really surprised that they even existed.  Everyone knows how drums and percussion are fundamental to African and Malian music, and after spending a year in Mali I was also familiar with stringed instruments like the Kora and the Donsongoni.

Here is Dowda showing how a Buru is played.

It is buzzed like a brass instrument, and it can only play one note.  This is probably why Kissa has a whole rack of Burus - so that each one can hit a different note.  They are made out of wood, with leather tied around them.  Here are some shots of the Burus:



When they finally pulled out the Burus, drummers showed up and a couple of people each grabbed one and started playing. It ended up being quite chaotic, although a rhythm started to form. Here are 20 seconds of Buru honking:


Later they told me that only really old people know how to play the Buru or dance to it, but now they are too old to do it. In other words, the traditions around Buru playing are being lost, and when the young people try to play these days, they can only toot in an uncoordinated way. It used to be used for a variety of things, but most notably for funerals.  This makes sense, given its somber yet wailing tone.  Someone showed me some grainy cellphone footage of elders from a nearby larger village called Goroko playing the Buru, but it seems like the people of Kissa are losing much of their cultural heritage surrounding the Buru.

The Buru and the Senufo

After people grew tired of the Burus and unceremoniously stopped playing, they went to give them back to the old man who guards them (although they belong to the entire village).  I asked the old man if I could look at all of the Burus more closely, and I noticed that two of them had distinct carvings.  One of them had two little figures, so worn that I couldn't really make them out.

Another had a very ornate carving, which I recognized as a hornbill, sacred to the Senufo people.  I had just recently seen several Senufo hornbills in this exact style in a museum in Sikasso, the Senufo homeland.

 Here are some larger Senufo hornbills from museums that are clearly in the same style as the one on the Buru: upright, short wings, and with the beak down the middle.

                 Source: Wikimedia Commons

 Here are some more shots of the hornbill carving:

This is fascinating, because the people of Kissa haven't been Senufo for quite a long time.  They tell me that they used to be in the distant past, but have given up the Senufo culture and language to become Jula.  In fact, no one could tell me what the figure on top of the Buru was, or what it represented, although this would be obvious to a true Senufo.  This conversion may have happened in the late 19th century when the Wassoulou Empire of Samori Turé was taking ground from the Senufo Kénédougou Kingdom of Tieba and Babemba Traoré, and Kissa was right on the boundary. People still remember how Samori Turé took over the area. He was laying siege to the local stronghold of Goroko, but could not get past its massive walls. Finally, he bribed the traitorous gatekeeper, and took the city.  This brought the whole area of Yorobougoula under Samori's control.  Within his empire, he strictly enforced the Muslim religion and the Maninka language.  It is said that in the empire's historically Fula homeland of Wassoulou, anyone caught speaking Fula would have their tongue cut out. So, the people of Kissa likely lost their Senufo language and culture during his conquests in the 1870s - meaning that the Burus are older than that!

Guns in Mali

Modern Guns

The village of Kissa has a strong tradition of hunting, and many households own some sort of firearm.  They distinguish between two types of gun. One is a modern gun, like a western rifle or shotgun, which they load with pre-made cartridges. These guns and their cartridges are available in the markets, and are probably made in China, just like everything else in Mali.  Here is a picture of a hunter with a modern gun:

 "African Guns"

The other kind of gun they have is the old-fashioned flintlock rifle. They make the bullets and gunpowder themselves, and have to load the rifle from the muzzle end. They call these "African guns" because they are not available in the markets - pretty much the only way to get one is to inherit it.  They insist that there are blacksmiths who make these guns by hand, but no one could name a blacksmith who made them, or even another village where you might find such a blacksmith.  I think these guns are really old and were not made in Africa.  I think they come from pre-colonial contact with Europeans during the slave trade, which means they could be hundreds of years old. On a recent trip to Mali, I took some pictures of two of these "African guns", and I'm hoping someone out there on the internet can tell me more about them, where they come from, and how old they might be.

Gun #1

This gun seems to have some sort of serial number, which means it was definitely not made by a Malian blacksmith. Maybe this number could yield some info about this particular gun's history?


Gun #2


I'm gonna post this around Reddit and maybe some history and firearms forums to see if anyone can tell me any more about them. I would love to know how old these guns are, and how so many of them might have wound up in Kissa, probably 500km from the coast.  Anyone who knows about this stuff, please share!

Wednesday, October 1, 2014

Back in Mali!

Hello all! I am back in Mali to do research for my Master's thesis. It has been so wonderful to be back here in Mali with all of its craziness and joyfulness. Part of my work was in my old Peace Corps village of Kissa, and it was kind of a dream to be back, chatting with all my old friends, and hiking through all the old forests and fields I used to explore.  I have been resolute about taking lots of pictures this time, as I did not take nearly enough during my Peace Corps stay. So here are some of the more interesting and illustrative pictures.

This is Toh, pretty much the national dish. You take a handful of the goop, made from millet or corn flour and kind of textured like polenta, and then you dip it into one of the bowls of sauce. The sauce is made from combinations of okra, peanut butter, tomatoes and hot peppers. With a good sauce, Toh can be delicious, and I thoroughly missed it while I was away.

This is how Malians do tea: with two shot-sized glasses, two little tea pots, and a lot of sugar! The tea is boiled down into a thick, syrupy shot, and one round of tea can provide about 3 to 6 people with a sip. Usually there are 2 or 3 rounds, taking place over the course of a conversation-filled hour. There is lots of pouring the tea back and forth, to thoroughly mix it in and to cool it off.

 Here is a hunter playing a hunters-guitar (donsongoni). There was a big celebration in Kolondieba for Malian independence day, and lots of hunters came in traditional garb with guns and musical instruments. I was invited to sit up with the Mayor, and the hunter was going around and singing to each person so that they would give him some money. He came straight to me, figuring the American would have the most money. I snapped some pictures and gave him some change. You put it directly in the guitar, actually, and the hunter rattles it around and makes it a part of the instrument.

 Here's an interesting picture: a satellite dish, surrounded by Mango trees, mud huts and thatch roofs. I was out for a walk and I came across this bugu-da, a household out in the middle of nowhere. Often people will choose to live out here because of the virgin soil and empty space, which helps to grow more productive crops and raise more cattle. I've noticed that these people tend to be more wealthy, as you can see that this particular farmer, Lassina Kone, was able to buy a satellite dish and a color TV! Lassina was very friendly and curious about America, and offered me some yams to take with me.

This is one of my favorite pictures of one of my best friends, Adama. He is both very curious and very informed about the world, and loves when I get National Geographic magazines sent from home. Here, he is using an inflatable globe that I brought to explain to some people how it is the earth that moves, and not the sun. This is actually a pretty controversial subject in my village, but luckily Adama and his new globe should help settle the debate.

My Research...

I got a grant from the West African Research Association to look at malnutrition in Mali, and how it is correlated with other factors like cotton production and environmental degradation. I used a map that I published before on this blog to apply for the grant, and I think it explains a lot of the context of my research. I will be doing work in three different villages, and for each village I will be doing household surveys, as well as forest and land cover surveys.  The idea is to see if healthy forests and certain livelihood strategies (like growing cotton) have any significant relationship with patterns of malnutrition. Here are some pictures from the research.

Collecting a list of all of the household heads' names from the village secretary, Lassina (black hat). I randomly picked from the list to determine which households to interview. Also pictured are my good friend Oumar, who worked with me during my Peace Corps service, as well as my host Amadou, in the blue shirt. Lassina and Oumar were both incredibly helpful when I showed up and explained what I had to do.

Here are some shots of me conducting interviews. They were taken by Oumar, who really enjoys using the camera!

 This is me measuring Mid-Upper Arm Circumference, a good indicator of a child's overall health. For all the children in each study household, I have to measure their arms. Often they are terrified, having never seen a white person before. To make the experience less traumatic I give them candy.  Also, notice in this picture, someone in the background wearing a shirt that says something in English. There is a 0% chance she knows what that shirt says.

Finally, I give the kids candy. Actually, these are those vitamin-fortified candies you can get in America. If the kids have really skinny arms, I give their moms a couple, and tell her to feed the child one a day.

So that's my research so far! I can't wait to start the forest surveys.

Sunday, August 10, 2014

Ebola Prevalence

I am leaving in three weeks to do research for my Master's thesis in Mali. I can't wait. However, something that has been on my mind lately is the Ebola outbreak in nearby Guinea, Sierra Leone and Liberia. The media sure is talking a lot about it, and my family and friends are quite worried about this epidemic in West Africa.

But how much of an epidemic is it really? Well, there are about 21.6 million people in Guinea, Sierra Leone and Liberia, the three countries most affected by the outbreak. In those countries, there have been 959 Ebola deaths as of August 8th, according to the World Health Organization. That means that 0.0044% of the population has died from Ebola since the start of the outbreak in March 2014.

To compare that to a US statistic, we had 32,482 fatal car crashes in 2011. Given our population of 316 million, and counting one death per crash, the average American had roughly a 0.0043% chance of dying in a car accident in any five-month period of 2011.

That means an American was just about as likely to die in a car crash in a 5 month period as a West African from Guinea, Sierra Leone or Liberia was to die from Ebola in the 5 months since the outbreak began.  Fatal car accidents are a real problem in America, and every day we do things to minimize the chances of such a car accident happening to us - we drive carefully and soberly, and we wear seat-belts.  Similarly, Ebola is a real problem in West Africa, yet there are things you can do - that I certainly will do - to minimize your exposure and make it a manageable risk.
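For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above. The function name is made up for illustration, and it keeps the same simplifying assumption as the text: every fatal crash is counted as one death, and crash deaths are spread evenly across the year.

```python
def percent_dying(deaths, population, observed_months, window_months):
    """Percent of the population dying in `window_months`,
    given `deaths` counted over `observed_months`."""
    return deaths / population * (window_months / observed_months) * 100


# Ebola: 959 deaths in roughly 5 months, among about 21.6 million people.
ebola = percent_dying(959, 21.6e6, observed_months=5, window_months=5)

# US car crashes: 32,482 in 12 months, among 316 million people,
# scaled down to a 5-month window.
crash = percent_dying(32_482, 316e6, observed_months=12, window_months=5)

print(f"Ebola:       {ebola:.4f}%")   # Ebola:       0.0044%
print(f"Car crashes: {crash:.4f}%")   # Car crashes: 0.0043%
```

Both risks land within a hair of each other, which is the whole point of the comparison.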

Ebola is dangerous, and something the world should deal with quickly and decisively. But it is not rampant, just as fatal car accidents are not rampant here in America.

Monday, July 21, 2014

The Gaza Strip is smaller than the micronation of Andorra, and it's almost impossible to get out.  Israel controls every exit but one, and they are heavily blockaded, creating a humanitarian crisis.  A good map of Gaza's possible border crossings was made by The Palestinian Academic Society for the Study of International Affairs (PASSIA) and is here. The one border crossing into Egypt, the Rafah crossing, recently opened up to admit only injured Palestinians into Egypt. This border crossing and the nearby illicit tunnels may be what is driving much of the conflict. Israel wants to shut down the border crossing and nearby tunnels to cut off Hamas's access to supplies and therefore their ability to govern, whereas Hamas may be hoping to leverage the conflict to pressure Egypt to make the border crossing more open.

Being able to travel freely is one of the greatest privileges afforded to citizens of the first world, clearly illustrated in this map. This privilege is completely absent for the citizens of Gaza, trapped in a war and in a scant 139 square miles. To illustrate this, I took screenshots from MapFrappe of the area of Gaza overlaying different familiar areas of the world. Imagine being stuck somewhere a third the size of New York City - with rockets falling!

Sunday, July 13, 2014

Philippines Language Maps



I love language maps, and looking at how languages interact with space.  Much of the history of human movement through space can be re-constructed using linguistics and language maps.  For example, linguistic data shows that the Malagasy of Madagascar have their origins in Borneo, Indonesia. Or that people as far flung as the Irish, Persians, Spanish, Armenians, Germans and Punjabis all speak related languages, and they all have cultural and ancestral roots in one group of people that once lived somewhere near the Black Sea.

However, even with a good understanding of where a language or group of languages is spoken, it can be very difficult to depict on a map. Languages frequently overlap, or exist as linguistic continua that cannot be categorized into distinct languages. Language cartographers will often follow political boundaries, usually incorrectly.  This map, for example, makes it appear that the use of English abruptly ends and Spanish begins at the US-Mexican border.  It also looks like you are as likely to find French speakers in the far north of Quebec as you are in Montreal.

Another major issue with language maps is that they usually rely on perceptual data, not on real observations of languages "in the wild".  We all know that there is a boundary between Southern American English and Northern American English, but where exactly would you put the line? At the Mason-Dixon line? Well, no one really has a southern accent in DC or Baltimore, so what about somewhere across Virginia? It's tricky.  The best solution, which linguistic geographers have been using for years, involves large-scale surveys: asking people what they "would" say, recording where exactly they are, and then aggregating this data.  This has led to some cool maps, like these ones, but is incredibly time- and labor-intensive.

I believe that recently a new solution has emerged to these problems in mapping languages and dialects.  In the past few years, geotagged social media have become widely available, offering massive and readily available data sets for mapping everything from linguistic trends to sports fan domains to preferences for church vs beer.  Maps made from such large, geotagged, linguistic corpora show real occurrences of linguistic phenomena, rather than just perceptual linguistic boundaries.  Additionally, because such data is available in point form, it makes it much easier to display overlapping languages and linguistic continua.  So, I decided to take a crack at this, and mapped the languages of the Philippines using tweet data:



Collecting Tweets

To collect the tweets, I used the R package twitteR, a wrapper for the Twitter API.  I divided the Philippines into 1036 evenly-spaced points, and searched for all tweets within a 10 mile radius of each point, covering the whole area of the Philippines. I ran this every night for 5 days until I had one million tweets, of which about 25% were georeferenced.
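The collection itself was done in R with twitteR, but the geometry behind it is easy to sketch in Python. The snippet below is only an illustration under stated assumptions: the bounding box is a rough guess at the extent of the Philippines, the 14-mile spacing is chosen so that 10-mile circles on a square grid overlap with no gaps (the spacing has to stay under 10 × √2 ≈ 14.1 miles), and the function names are hypothetical. It does not call the Twitter API and will not reproduce the exact 1036 points, since it also covers open water.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def make_grid(lat_min, lat_max, lon_min, lon_max, spacing_miles):
    """Return (lat, lon) search centers spaced about `spacing_miles` apart."""
    points = []
    lat_step = math.degrees(spacing_miles / EARTH_RADIUS_MILES)
    lat = lat_min
    while lat <= lat_max:
        # Degrees of longitude shrink away from the equator, so widen the step.
        lon_step = lat_step / math.cos(math.radians(lat))
        lon = lon_min
        while lon <= lon_max:
            points.append((lat, lon))
            lon += lon_step
        lat += lat_step
    return points


def within_radius(center, located_items, radius_miles=10.0):
    """Keep the (lat, lon, payload) items that fall inside the search circle."""
    return [item for item in located_items
            if haversine_miles(center[0], center[1], item[0], item[1]) <= radius_miles]


# Assumed, approximate bounding box for the Philippines.
grid = make_grid(4.5, 21.5, 116.0, 127.0, spacing_miles=14.0)
```

Each grid point would then be passed as the center of one radius-limited search, and points that fall in the sea simply return nothing.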


Collecting Corpora

In order to identify the language of the tweets, I needed corpora.  A linguistic corpus (singular of corpora) is a large body of text in a given language.  This large body of text is used to generate data, usually statistical signatures, that can be used to determine if a given sample text (like a tweet) has the same features as the corpus.  So, in order to tell if a tweet is in, say, Hiligaynon, you need a lot of samples of Hiligaynon.  To build these corpora, I got samples of literary and religious texts in Tagalog, Bikol, Ilokano, Hiligaynon, Pangasinense, Kapampangan, Cebuano and Waray from SEAlang. However, people speak quite differently in a religious or literary setting than they do while tweeting informally, so to get more "modern" samples of each language, I also built corpora by web scraping from Wikipedia using Python's BeautifulSoup package.  I made a script that collects the body text from random Wikipedia articles (using the "Random Article" link), and I also made a script that starts at the page for the Philippines and collects text from every page that it links to, and then every page that those ones link to, etc.  I used both scripts until I had generated a corpus of 300,000 words for each of the aforementioned languages except Hiligaynon, as well as for Chavacano de Zamboanga, which SEAlang did not have a corpus for. Hiligaynon has a beta-wiki with a couple hundred pages, and I used every single one of them to create a considerably smaller corpus.
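The link-following script described above is essentially a breadth-first crawl that stops once enough words have been collected. The sketch below shows that idea only; it is not the original script. To keep it runnable without extra installs it swaps BeautifulSoup for the standard library's html.parser, and it takes a `fetch` function as a parameter (a hypothetical stand-in for whatever downloads a page) so the crawl logic can be tried on in-memory pages.

```python
from collections import deque
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects paragraph text and link targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.paragraph_depth = 0
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraph_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "p" and self.paragraph_depth:
            self.paragraph_depth -= 1
            self.text_parts.append(" ")  # keep paragraphs from running together

    def handle_data(self, data):
        if self.paragraph_depth:
            self.text_parts.append(data)


def crawl_corpus(start, fetch, target_words):
    """Breadth-first crawl from `start`, stopping at `target_words` words."""
    queue = deque([start])
    seen = {start}
    words = []
    while queue and len(words) < target_words:
        parser = PageParser()
        parser.feed(fetch(queue.popleft()))
        words.extend("".join(parser.text_parts).split())
        for link in parser.links:
            if link not in seen:  # never visit the same page twice
                seen.add(link)
                queue.append(link)
    return words


# Tiny in-memory "wiki" standing in for real pages.
pages = {
    "A": '<p>alpha beta</p><a href="B">next</a>',
    "B": '<p>gamma</p><a href="A">back</a>',
}
corpus = crawl_corpus("A", pages.__getitem__, target_words=10)
```

For real use, `fetch` would download the page at a URL, and the link filter would need to keep the crawl inside one language's Wikipedia.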


Identifying Languages

Most language identification algorithms work by taking 3 and 4 letter samples of the corpus (called 3-grams and 4-grams, or just n-grams) and determining the frequency distribution of their occurrences.  This is done for multiple languages and corpora.  These n-grams are also sampled from the text to be identified, and the distribution of n-grams from the sample text is compared to the distribution of n-grams in the various corpora. Whichever corpus has the n-gram distribution that most closely matches that of the sample text is taken to be the language of the sample text.
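As a rough illustration of the n-gram approach just described, here is a minimal sketch. Real identifiers differ in how they compare distributions; cosine similarity is used here only as one simple, assumed choice, and the two "corpora" are single toy sentences rather than real training data.

```python
import math
from collections import Counter


def ngram_profile(text, sizes=(3, 4)):
    """Count every character 3-gram and 4-gram in the text."""
    text = text.lower()
    counts = Counter()
    for n in sizes:
        counts.update(text[i:i + n] for i in range(len(text) - n + 1))
    return counts


def cosine_similarity(a, b):
    """Similarity of two n-gram count profiles, from 0 (disjoint) to 1."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def identify(sample, corpora):
    """Return the language whose corpus profile best matches the sample."""
    sample_profile = ngram_profile(sample)
    return max(
        corpora,
        key=lambda lang: cosine_similarity(sample_profile, ngram_profile(corpora[lang])),
    )


toy_corpora = {
    "english": "the quick brown fox jumps over the lazy dog",
    "spanish": "el rapido zorro marron salta sobre el perro perezoso",
}
print(identify("the dog", toy_corpora))  # english
```

The toy example works because English and Spanish share almost no n-grams in these sentences, which is exactly the kind of separation the method depends on.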

However, this method fails miserably for Austronesian languages, and exclusive word lists must be used instead.  This is because Austronesian languages have relatively small phonemic inventories (Hawaiian only has eight consonants!) and almost all have a simple, Consonant-Vowel syllable structure.  Thus there is not enough variability in possible n-grams, and languages cannot be classified based on n-gram distributions.  Google Translate uses the n-gram method, and when I use it for Tagalog, it frequently thinks that the text I am entering is in Indonesian or Cebuano.  A more relevant example of this is in other attempts to map world languages.  Two high-res language maps of Twitter exist here and here. Outside of the Philippines, they are fantastic (just look at Europe), but I am assuming that they use the n-gram method, because they both mis-identify Tagalog (and other Filipino languages) as Indonesian:




Comparison to Other Maps

So here is my final map placed next to a map of the languages of the Philippines from Wikipedia.  The map from Wikipedia does a good job of quickly showing where one might find minority languages, but it does a bad job of showing location precisely, specifically with regard to density.  It also does a pretty bad job of showing where languages overlap.  It is clear in the tweet map, for example, that while the minority languages are confined to certain regions, English and Tagalog are prevalent throughout the country.  This is unclear in Wikipedia's map, which makes it appear as if Tagalog is just one of many minority languages, when in fact it is far more prevalent.  The tweet map also does a much better job of displaying language density.  Ilocano, for example, is most common on the northeast coast of Luzon, is much less common in north-central Luzon (because of the many small languages spoken in the mountains) and is also uncommon on the northwest coast of Luzon (because the whole area is sparsely populated).  This is clear in the tweet map, where most of the Ilocano tweets are on the northeast coast and only scattered tweets appear in other regions, whereas the Wikipedia map makes the Ilocano area appear uniform.  The Wikipedia map does add the diamond indicating that a language is only a plurality - but it does not indicate which other languages are present or where a given language is concentrated, like the dot map does.


English vs Tagalog

Clearly, English is widely used throughout the Philippines, as is Tagalog.  However, it appears that Tagalog and Taglish (code-switching between English and Tagalog in one tweet) are more common in the North, the homeland of the Tagalog language.  I was told by many Filipinos that southerners, who speak mostly Cebuano, resent the fact that Tagalog became the national language, since Cebuano was spoken by more people over a wider area.  For this reason, Cebuanos use less Tagalog and more English.  This map seems to confirm that, and taking the median point for Tagalog, English and Taglish shows a trend of Tagalog being more common in the north and English being more common in the south.  Nevertheless, English was much more common than Tagalog, with 80,000 English tweets, 28,000 Tagalog tweets and 16,000 tweets with at least 3 words in Tagalog and 3 words in English.
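The "median point" mentioned above can be computed in more than one way; a simple version, assumed here for illustration, takes the median latitude and the median longitude separately. The coordinates below are made-up placeholders, not values from the tweet data.

```python
from statistics import median


def median_point(points):
    """Coordinate-wise median of a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return median(lats), median(lons)


# Hypothetical tweet locations for one language category.
sample_points = [(14.6, 121.0), (10.3, 123.9), (7.1, 125.6)]
print(median_point(sample_points))  # (10.3, 123.9)
```

Computing one such point per category and comparing their latitudes gives a quick check of which languages skew north or south.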


Minority Languages 

Here is a map of the minority language tweets.  As I discuss later, there were issues with language identification for these minority tweets, so their relative totals are probably not indicative of how prevalent they actually are on Twitter, much less in everyday, spoken usage.  Nevertheless, I think this map does a great job of displaying their geographic distributions.  Hiligaynon (also called Ilonggo) is accurately shown as occurring on both Panay Island and Negros in the central islands (called the Visayas), as well as in the east of the southern island of Mindanao.  Cebuano was also accurately identified across its full range, despite the fact that it has many different varieties, indicating that the corpus drew on a good variety of texts.  This was not true for Bikol, which also has many varieties.  Because the corpus from Wikipedia was based on Central Bikol, also called Naga Bikol, both of the Bikol tweets identified were found near Naga, and not in the larger city of Albay, which speaks a slightly different variety.

Many of the minority languages were very under-represented on Twitter, with only one tweet coming in from Waray, two from Bikol, and nine from Kapampangan.  In addition, no tweets were found in Pangasinense, spoken between Kapampangan and Ilocano, or Chavacano de Zamboanga, a creole based on Spanish and local languages, spoken in Mindanao.


Issues with Language Identification

One major issue with this study was that the corpora and tweets were written in quite different registers.  Filipinos use different words when they are writing tweets than when they are writing the Bible.  And even when they are using the same words, they spell those words differently.  This is especially true for minority languages, which are almost entirely spoken and rarely used in a formal literary setting.  English, on the other hand, is the main language used at school, in business and in other settings that involve a lot of writing.  Thus, Filipinos are much more likely to use "proper" English than they are "proper" Tagalog when writing tweets, and they would almost never use the "proper" spelling and vocabulary of their minority languages.  However, my corpora were based on the "proper" versions of these languages, and thus minority languages are quite underrepresented, and my study does not represent an accurate measure of, say, Waray usage versus English usage.  Only about 125,000 of the 250,000 geotagged tweets could be classified, and I believe that most of this missing half represents tweets written in "improper" Tagalog and minority languages (as well as a couple of tweets in completely different languages, like Chinese or German).

For example, to say "I have not yet been to that new SM" would be, in proper Tagalog, "Hindi pa ako nakapunta sa yung bagong SM".  However, a Filipino using social media would likely write something like "Di p aq nkpunta sa yn bago SM".  This manner of writing has been taken to an almost incomprehensible level in the cultural phenomenon of jejemon (similar to leetspeak in English) and cannot be identified as Tagalog from a corpus in "proper" Tagalog.  Slang is also quite common in spoken Tagalog and in spoken minority languages, and is impossible to identify from a "proper" corpus.  In fact, there are whole dialects of Tagalog based on slang, such as this code language used by gays. One Bikol word I learned when I was living in the Philippines was "uragon", which means strong or manly.  However, this word was not in the Bikol corpus I generated: nowhere in the Bible, in literary texts or on the Bikol Wikipedia is it used.

Finally, the end result of the language classification required a great deal of cleaning and spot checking.  This was because many of the minority languages' corpora contained words that were also in English slang, so many English tweets were misidentified as minority language tweets.  For example "haha" is a word in Waray (or at least, is a word used in the Waray Bible, Waray literary works, or the Waray Wikipedia) and "omg" is a word in Kapampangan.  Since I decided that three matching words make a tweet fall into a given language category, the following tweets were classified as Kapampangan and Waray:

"ge tawa tayo :( haha haha haha"

I had to do a lot of spot checking and re-running the classification before I ended up with results that I was satisfied with.
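To make the word-list approach concrete, here is a minimal sketch of exclusive word lists combined with the three-matching-words rule. It is a reconstruction for illustration, not the code behind the map: the function names are invented, the corpora are toy strings (the Tagalog and English ones borrow the example sentence above, and the Waray one contains only "haha"), and real tokenization would need more care than splitting on spaces.

```python
def exclusive_word_lists(corpora):
    """For each language, keep only the words found in no other corpus."""
    vocab = {lang: set(text.lower().split()) for lang, text in corpora.items()}
    exclusive = {}
    for lang, words in vocab.items():
        others = set().union(*(w for other, w in vocab.items() if other != lang))
        exclusive[lang] = words - others
    return exclusive


def classify(tweet, exclusive, min_matches=3):
    """Return every language with at least `min_matches` matching words."""
    tokens = tweet.lower().split()
    counts = {lang: sum(token in words for token in tokens)
              for lang, words in exclusive.items()}
    return sorted(lang for lang, n in counts.items() if n >= min_matches)


toy_corpora = {
    "tagalog": "hindi pa ako nakapunta sa bagong sm",
    "english": "i have not yet been to that new sm",
    "waray": "haha",
}
word_lists = exclusive_word_lists(toy_corpora)

print(classify("hindi pa ako", word_lists))                    # ['tagalog']
print(classify("i have not hindi pa ako", word_lists))         # ['english', 'tagalog']
print(classify("ge tawa tayo :( haha haha haha", word_lists))  # ['waray']
```

The last line reproduces the false positive described above: three repetitions of "haha" are enough to push an otherwise unrecognized tweet into Waray. A tweet that clears the threshold for both Tagalog and English comes back with both labels, which is how a Taglish category could be defined.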



I think there are a number of ways I could do this project better, if I ever wanted to use this for an academic conference or paper.  A larger Twitter data set could help capture more tweets from the more obscure minority languages.  However, better language classification techniques could definitely make major improvements.  I think the best way would be to build corpora from tweets themselves. To get this, I'd have to find native speakers of each minority language, give them a data set of a couple thousand tweets, and have them identify the ones that are in the minority languages they know.  Then, these training tweets would be used to classify, say, 1 million tweets.  Another possibility would be to run an "unsupervised" classification.  With this method, there is no training data set or corpus, but rather all of the tweets would be sorted into natural groups based on certain statistical features.  I have done this before with pixels in remotely sensed satellite imagery, but I am not exactly sure how to go about this for text data, or if it has ever been done before.
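One possible shape for the unsupervised idea is k-means clustering on word-count vectors, the same family of technique often used on image pixels. The sketch below is purely illustrative and untested on real tweets: the choice of k-means, the bag-of-words features, the fixed starting centroids and the four toy "tweets" are all assumptions, and nothing here says how well it would separate closely related languages.

```python
import math
from collections import Counter, defaultdict


def word_vector(text):
    """Unit-length bag-of-words vector stored as a sparse dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def distance(a, b):
    """Euclidean distance between two sparse vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def mean_vector(vectors):
    """Average of a non-empty list of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for word, value in vec.items():
            total[word] += value
    return {word: value / len(vectors) for word, value in total.items()}


def kmeans(texts, k, iterations=10):
    """Group texts into k clusters; returns one cluster label per text."""
    vectors = [word_vector(t) for t in texts]
    centroids = vectors[:k]  # deterministic start: the first k texts
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: distance(vec, centroids[c]))
                  for vec in vectors]
        updated = []
        for c in range(k):
            members = [vec for vec, lab in zip(vectors, labels) if lab == c]
            updated.append(mean_vector(members) if members else centroids[c])
        centroids = updated
    return labels


toy_tweets = ["kumusta ka", "how are you", "kumusta ka na", "how are you today"]
print(kmeans(toy_tweets, k=2))  # [0, 1, 0, 1]
```

The clusters come back unlabeled, so someone would still have to look at a few tweets from each group to decide which language, if any, it corresponds to.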

I would also like to use this method that I have just developed in other places outside of the Philippines.  I think Indonesia and Malaysia would be good candidates, as they are other areas with high twitter penetration, many minority languages, and Austronesian languages that cannot be classified with the n-gram method.  Another possibility would be to map the languages of tweets in various international cities with a lot of linguistic diversity, like Singapore, London, New York and Hong Kong.