In our latest project at Vodafone R&D, I worked under the supervision of Prof. Gulsen Eryigit of Istanbul Technical University. Prof. Eryigit is widely regarded in the Turkish academic community as one of the leading researchers in Natural Language Processing for the Turkish language, and I was fortunate to have the chance to work with Prof. Eryigit. Within a month, I implemented a predictive model based on Word2Vec, using Python and the Gensim library.
The Word2Vec approach has been one of the trending topics in Natural Language Processing since 2014. In this technique, a neural network-based model is trained on a large text corpus in order to represent words as multi-dimensional vectors. After implementing the predictive model, we scored it using Mean Average Precision, a well-known metric for ordered results (that is, when the ranking of the outputs is important). We are now following the ongoing research in SemEval-2017 Task 3.
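To make the Mean Average Precision metric more concrete, here is a minimal Python sketch. It is not the code we used in the project: the function names are illustrative, each query's results are assumed to be given as a ranked list of relevance flags, and every relevant item is assumed to appear somewhere in that list.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one ranked result list.

    relevance[i] is True when the item at rank i + 1 is relevant.
    Assumption: all relevant items for the query appear in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank: relevant items seen so far / rank.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two hypothetical queries with their ranked relevance flags.
    queries = [
        [True, False, True],  # AP = (1/1 + 2/3) / 2
        [False, True],        # AP = (1/2) / 1
    ]
    print(round(mean_average_precision(queries), 4))  # 0.6667
```

A relevant item near the top of a list contributes more to the score than the same item further down, which is why the metric suits tasks where the order of the results matters.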
It is a great pleasure to conclude my five years at Vodafone with such a meaningful project!
Five months ago, I wrote in a blog post here that we had decided to move to Milan for the next 4-5 years. I am now enrolled in the Ph.D. program in Computer Science and Engineering at Politecnico di Milano. It will be a great experience to join Politecnico di Milano, which is ranked 49th worldwide. As a result, I am leaving my current job at Vodafone, where I learned and applied a great many things. Thank you to all my colleagues there for their friendship.
Recently, I read The Lean Startup by Eric Ries and learned a lot about how to build a technology startup today.
Then, I extracted the fundamental ideas of the Lean Startup philosophy as a list of questions. I have already answered the first question below, and I look forward to your answers and comments on the others…
1- What is a Build-Measure-Learn Cycle?
Build: You start by building a minimum viable product (MVP). It is essential for learning whether the main idea of a startup is promising or not. You do not need to write a single line of code to build an MVP; you create the most basic environment in which you can measure the behavior of your potential customers.
Measure: In order to understand the interest of your customers, you need to use some analytical tools; cohort analysis is a key component of measuring user behavior. The most important step, however, is to directly contact the customers who have experienced your MVP and organize focus sessions with them. If you don’t have any customers to measure, try Google Ads to build a small user base.
Learn: Now it is time to decide whether to pivot or persevere. Pivoting means transforming your initial idea into what customers want. If your customers find your MVP valuable and commit to using it, you don’t need to pivot; you persevere by adding features to your first product.
2- What is the meaning of a leap-of-faith hypothesis?
3- How do you decide whether a startup is on the right track or not?
4- What is the difference between successful and unsuccessful entrepreneurs?
5- How do you measure the value of a network?
A common mistake is to compare MapReduce with Spark.
MapReduce is a programming paradigm, not a product, so it cannot be compared with Spark directly. What we can compare is how Hadoop and Spark each apply MapReduce.
In Hadoop MapReduce, each job has one Map phase and one Reduce phase, whereas in Spark the Map and Reduce steps can be combined within a single job. In addition, Hadoop MapReduce writes the output of each job to a file, while Spark keeps it in memory. As a result, Spark reduces the overall execution time of the whole job.
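The MapReduce paradigm itself can be sketched in a few lines of plain Python, independent of both Hadoop and Spark. The word-count example below is only an illustration: the function names are hypothetical, and everything runs in a single process with the intermediate results held in memory.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every line."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each group of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    # The intermediate pairs and groups are passed along in memory;
    # nothing is written to disk between the phases in this sketch.
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["spark and hadoop", "Spark in memory"]))
```

A real framework distributes the map and reduce work across many machines; the point here is only the shape of the computation that both Hadoop and Spark build on.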
Today, I attended the Machine Learning in Milan Meetup event, hosted by Marcosca in Via Bligny. It was the first meeting of this group, so few people knew each other. It was a great opportunity to meet people from different backgrounds. The purpose of this group is to create a Machine Learning community in Milan by organizing meetups and events, and then to accelerate this ecosystem with startups and investors. I feel lucky that the event took place while I was in Milan for personal matters. I thank everybody who contributed to organizing it. The next meetup will be on 27 October. You can find the details in the Meetup app.
Nowadays, chatbots (that is, conversational agents) have become very popular with companies; almost every company has an ongoing chatbot project! In general, companies prefer a cloud-based (SaaS) chatbot in order to go live quickly. However, when the conversation requires domain-specific knowledge, a generic chatbot becomes inefficient. In this situation, a custom in-house solution seems a better option.
As an example, at Vodafone, the challenge was to give relevant information about telecom-specific services in the Turkish language. Our in-house product is now capable of holding a conversation in Vodafone terminology! Vodafone subscribers are able to get information about their tariff or take actions such as purchasing an add-on, changing their tariff, etc.
I now contribute to this product to make it smarter and self-learning, and it is a great experience for me!
For the last 2 years, we had been planning to live in a foreign country and looking for the best opportunity for both my wife and me. We had various alternatives, including Toronto, Madrid, Berlin, and Milan. In the end, we decided to move to Italy, since the Italian lifestyle seems very close to the Turkish one. Moreover, we speak French, which should help us learn Italian quickly. I think there is no need to talk about how delicious Italian cuisine is, with its pizzas, kinds of pasta and more…
As of 1st September, my wife Seda is going to begin her Ph.D. studies at Bocconi University in Milan. She has a full scholarship, which will help cover our living expenses. For my part, I will have to leave my current job at Vodafone and seek new opportunities in Milan. I have applied to Politecnico di Milano for the Ph.D. degree in Computer Science, and I will also apply to Milan Bicocca University. By the way, I have started to learn Italian, since I have a huge interest in the Italian language. I have already completed one course at the Italian Culture Institute, and I am now taking the second. Maybe my next blog post could be in Italian, since I have made great progress in a very short period of time. I have also started to listen to Italian songs, and I am happy when I understand the meaning of the lyrics…
I will keep sharing my experiences here…
Today, digital companies are working extensively to reach their potential customers on the web. Companies and consulting firms build their digital marketing strategies separately for each channel: social media, paid media and their owned media channels.
At this point, there are many marketing tools available; some of the most prominent are Tealium and the Adobe Marketing Cloud tools. Moreover, Data Management Platforms (DMPs) enable more efficient targeting with the help of third-party data. Today, the trend is to upload customer data to DMPs in order to earn revenue from the commercial campaigns of other companies. For example, if you want to target mothers who had a child in the last 3 months, you can reach this specific segment through a DMP even if you don’t have any data about mothers yourself.
Current players in the DMP business include, for example, Oracle's BlueKai product.
Custom Audience is a well-known Facebook feature for targeting your ad campaigns using your own target list.
While Google limits advertisers to uploading email information only, Facebook gives additional options such as mobile phone numbers, which are a very valuable resource for many industries, including Telco companies.
Facebook gives advertisers an excellent opportunity to set Custom Audience targets dynamically by exposing its Custom Audience APIs in many languages. You can find the Java SDK here: https://github.com/facebook/facebook-java-ads-sdk
If you need to update the user list of your Custom Audiences dynamically, you need to integrate your working environment with these APIs. To do so, you will need to grant access using the parameters below.
- Facebook App Secret Code
- Facebook Ad Account ID
- Facebook Access Token having ads management permissions
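Before a user list is sent through these APIs, identifiers such as email addresses and phone numbers are normally normalized and hashed with SHA-256 on your side. The Python sketch below covers only that preparation step and makes no API call. The helper names and the exact normalization rules are my assumptions for illustration, so check the current Custom Audience documentation before relying on them.

```python
import hashlib
from typing import Dict, Iterable, List


def normalize_email(email: str) -> str:
    """Assumed rule: trim surrounding whitespace and lowercase."""
    return email.strip().lower()


def normalize_phone(phone: str) -> str:
    """Assumed rule: keep digits only (country code included by the caller)."""
    return "".join(ch for ch in phone if ch.isdigit())


def sha256_hex(value: str) -> str:
    """Hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def prepare_audience_rows(users: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Turn raw user records into hashed rows (hypothetical row layout)."""
    rows = []
    for user in users:
        rows.append({
            "email_sha256": sha256_hex(normalize_email(user["email"])),
            "phone_sha256": sha256_hex(normalize_phone(user["phone"])),
        })
    return rows


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = [{"email": "  User@Example.com ", "phone": "+90 (555) 123-45-67"}]
    for row in prepare_audience_rows(sample):
        print(row)
```

Hashing on your own side means that raw emails and phone numbers never leave your environment, which matters a lot when the list comes from a Telco subscriber base.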
Finally, you will be able to target your campaigns to the right audience in real time. If you need guidance for integration issues, feel free to contact me.
When you do analysis on Hadoop, Apache Pig is one of the simplest ways to get and transform the data. Another alternative is Apache Hive, which seems easier for people who already know SQL. I have used both, but I find writing scripts with Pig better, since you can see your data at each step of the code. Moreover, it is more human-readable than SQL-style code blocks (nested SQL, etc.).
In the last two years, I have written many Pig scripts. I would like to give some tips about Pig scripting.
- Use DEFINE to separate the file-loading functions into a different Pig script, which can be named Loader.pig
- When Pig does not provide the functionality you need, write your own User Defined Functions (UDFs) in Java. For example, if you need to compare object values, or if you want to use a sorting algorithm, you can write your own Java code and call it from the Pig script. This feature greatly increases the flexibility of Apache Pig: once you enter the Java UDF world, you can do almost anything by combining Java and Pig. The main challenge here is tracking the objects passed to the UDF, but you get better at it through lots of trials.
- Parameter substitution is a prominent feature of Pig. With %declare statements, it is possible to define custom variables. However, assigning values dynamically is a challenge.
- Before running in the default MapReduce (cluster) mode, complete your tests in local mode (pig -x local) with a small amount of data, since waiting to see the script results in cluster mode is inefficient.