Coursera – Deep Learning

Online courses are very important tools for improving our Data Science skills. Two years ago, I took Andrew Ng’s Machine Learning course and learned a lot from Prof. Ng’s clear teaching style. Although that course had a chapter on neural networks, it was based on the Octave programming language. Now I have started Prof. Ng’s Deep Learning course, which is shorter than the previous one: it takes only 4 weeks to complete. One of its advantages is that it is based on Python, which is the most popular Data Science programming language today. The instructors have prepared Jupyter notebooks that run on the Coursera website, which simplifies the programming part because there is no need to install packages on a local computer. I have now completed Week 3, and I strongly recommend this valuable online course to anyone who is interested in Data Science.

 

Comparison of two face recognition services: Clarifai and Face++

Recently, I tried several products for extracting demographic information from a profile image. My goal was to obtain information about age, gender, and ethnicity. I found that the prominent companies in the sector are Clarifai and Face++. I integrated my trial software with both products and found Clarifai’s accuracy to be better than that of Face++. My reasons are:

  1. Clarifai provides the probability value of its predictions (for example, the predicted gender is female with a probability of 52%), so it is possible to discard results that have a low prediction score. In contrast, Face++ does not provide that value. This is a drawback because a binary classifier always returns a result, even if its score is not very high.
  2. Clarifai correctly predicted the ethnicity of the image below as “White”, while Face++ wrongly predicted it as “Black”. On the other hand, Clarifai could not determine the gender correctly (female 51%, male 49%), while Face++ correctly marked it as male (we don’t know with what probability).
  3. The disadvantage of Clarifai is its low quota for free usage. It permits only 2500 API calls per month for free accounts. Face++ does not specify any upper limit for free accounts. Its only limitation is one API call per second.
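
The first point above, discarding low-score predictions, can be sketched in a few lines of Python. This is only an illustration and not Clarifai’s API: the function name `confident_label`, the dictionary format, and the 0.8 threshold are my own assumptions, while the scores are the ones from the Clarifai result shown in this post.

```python
from typing import Dict, Optional

# Hypothetical helper: the dictionary format (label -> probability) and the
# default threshold of 0.8 are assumptions made for this illustration.
def confident_label(scores: Dict[str, float], threshold: float = 0.8) -> Optional[str]:
    """Return the most probable label, or None if its score is below the threshold."""
    if not scores:
        return None
    label, prob = max(scores.items(), key=lambda item: item[1])
    return label if prob >= threshold else None

# Scores taken from the Clarifai result in this post.
gender = {"feminine": 0.510, "masculine": 0.490}
ethnicity = {"White": 0.981}

print(confident_label(gender))     # None, because 0.510 is below the threshold
print(confident_label(ethnicity))  # White
```

With a threshold like this, the uncertain gender prediction is dropped instead of being reported as “feminine”, which is not possible when a service returns only the label.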

I hope my hands-on experience with these services will help you choose the right product.

 

 

Result of Clarifai: (https://clarifai.com/demo)

Gender: feminine (prob. score: 0.510), masculine (prob. score: 0.490)
Age: 55 (prob. score: 0.356)
Ethnicity (Multicultural appearance): White (prob. score: 0.981)

Result of Face++: (https://www.faceplusplus.com/attributes/#demo)

Gender: male
Age: 53
Ethnicity (Multicultural appearance): Black

Converting texts to high-res images

A very inspiring piece of research was published at the end of 2016. With the help of deep learning, it is now possible to generate images from text descriptions.

Here is the link to the news and here is the link to that research paper.

Could you imagine some use cases based on this technology? I found an interesting one. Imagine you are in a police station after a bank robbery. The thief has not been found, and as the only eyewitness you describe the thief’s appearance. As you speak, a computer automatically generates an image of the thief based on the visual details you describe. At the same time, the computer improves the precision of that image by matching it against records of past robberies.

The Future of Education

Within the last month, the future of education was one of the main topics in Davos. There were very interesting debates, and in one of them, Jack Ma (the founder of Alibaba) said that the current education system needs to change urgently because of the rising impact of robots. Since robots can acquire knowledge by learning from their past experiences, they will do most of the things people do today. To adapt to the modern world, we need to educate our children in a way that cannot be copied by robots. Rather than teaching mathematics or physics to our children, we should support their more humanistic skills such as music and art.

I agree with Jack Ma’s ideas, and I think we need to think more about people’s main advantages and disadvantages compared with robots over the next 20 years. Today, our children start learning to code in primary school in order to communicate better with robots and understand their logic. But when the world is dominated by robot activity, everything will change, and humans should be in a position where robots do not see them as a threat.

 

 

Understanding the Feelings and Behaviours of English People about the Brexit Referendum

These days, my research goal is to find insights by analyzing Twitter data to understand how English people react to the Brexit referendum. Various studies have already been carried out on this topic, most of them by universities in England such as Imperial College London and the University of Bristol. I find it quite an interesting research topic, since social media is an important environment for presenting our ideas to the community and more research is needed to understand people’s opinions. I will give more detailed information about my study in the upcoming weeks. If you have any recommendations for me, please feel free to send me an email.

What is the importance of a PhD degree for me?

As I mentioned in my previous blog post, I have started my Ph.D. studies. What the world expects from Ph.D. students is that they invent something in their own research area. That is not as easy as writing these lines. It may never happen; it is completely uncertain.

Over the last five years, I contributed to various Data Science products at Vodafone Turkey R&D. They were all very important projects, each with its own characteristics. But we were not inventing something new there; we were just applying what prominent data scientists had already shown. Mostly, their approaches gave good results, but sometimes, as with the Louvain algorithm for community detection, they didn’t work and we had to implement our own algorithm. As a result, we published our novel algorithm at one of the most prestigious Data Mining conferences.

After leaving Vodafone and starting my Ph.D., I feel alone, but on the other hand I feel stronger because I have the freedom to choose my focus area. These days, I have been reading many papers in Data Science and looking at what award-winning Ph.D. students around the world did in their dissertations.

Here, I discovered that one of the paths could be:
1- Focus more on the theoretical part of Machine Learning
2- Observe the important research areas in Knowledge Discovery
3- Create a big data environment, because less data means more bias
4- Apply Machine Learning techniques to a real-world problem, and compare the results with existing approaches
5- Discuss the results with subject matter experts, make optimizations, and keep searching for the best result

To be honest, I am not looking for a strategy just to graduate. While doing my Master’s and working at Vodafone, I already knew that Machine Learning brings great new opportunities to almost every domain. But I was unable to dive deep into it, for example into Neural Networks, because I could not spare the time to learn such a complex field of study.

Now I am extremely motivated because I have time to learn, though I need to learn as fast as possible…

Finally, this is a great illustration of what a Ph.D. means for the world and for people. Inventing something new does not mean that you are the most powerful person in the world. Your success is only the size of a single bit in a world of huge matrices.

http://matt.might.net/articles/phd-school-in-pictures/

I also think the world has more dimensions than the two shown in the illustration, and each dimension has an impact on the others: poverty, wars, modern arts, space, family, health, beliefs, voluntary work, kindness, ethics, etc. So being a good person is more important than anything else.

 

Finalizing my Vodafone experience with Word2Vec

In our latest project at Vodafone R&D, I worked under the supervision of Istanbul Technical University professor Gulsen Eryigit. The academic community in Turkey widely regards Prof. Eryigit as the most important professor in the domain of Natural Language Processing for the Turkish language, and I was lucky to have the chance to study with her. Within a month, I implemented a predictive model based on Word2Vec using Python and the Gensim framework.

The Word2Vec approach has been one of the trending topics of Natural Language Processing since 2014. In this technique, a neural network-based model is trained on a large text corpus in order to represent words as multi-dimensional vectors. After implementing the predictive model, we calculated its score using Mean Average Precision, a well-known metric for cases where the results are ordered (that is, the ranking of the outputs is important). Now we are following the existing research in SemEval 2017 Task 3.
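
To make the Mean Average Precision idea concrete, here is a minimal sketch in plain Python. It is not the evaluation script we used: the function names and the toy rankings are made up for illustration, and it assumes that every relevant item appears somewhere in the ranked list, so the average is taken over the relevant items found.

```python
from typing import List, Sequence

def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision of one ranked list.

    relevance[i] is True when the item at rank i + 1 is relevant.
    Assumption: all relevant items are present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings: List[Sequence[bool]]) -> float:
    """Mean of the average precision over all queries."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Toy example with two queries (made-up data).
rankings = [
    [True, False, True],  # AP = (1/1 + 2/3) / 2
    [False, True],        # AP = (1/2) / 1
]
print(round(mean_average_precision(rankings), 4))  # 0.6667
```

Because precision is measured at the rank of each relevant item, a model that puts relevant answers near the top of the list gets a higher score than one that returns the same answers in a worse order.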

It is a great pleasure to conclude my 5 years of work at Vodafone with such a meaningful project!

 

 

Implementation of a Chatbot

Nowadays, chatbots (also known as conversational agents) have become very popular with companies; almost every company has an ongoing chatbot project! In general, companies prefer a cloud-based (SaaS) chatbot in order to go live quickly. However, when the conversation requires domain-specific knowledge, a generic chatbot becomes inefficient. In that situation, a custom in-house solution seems a better option.

As an example, at Vodafone, the challenge was to give relevant information about telecom-specific services in the Turkish language. Our in-house product is now capable of holding a conversation in Vodafone terminology! Vodafone subscribers can get information about their tariff or take actions such as purchasing an add-on or changing their tariff.

I am now contributing to this product to make it smarter and self-learning, and it is a great experience for me!