Academics & Job history

As of 29 September 2018

Summary

Academic experience

Osaka University, School of Foreign Studies

Major: Persian language and culture in Iran and Afghanistan
Degree: B.A. in languages and cultural studies
1 April 2008 ~ 31 March 2012
Osaka, Japan
Knowledge and skills
  • Persian language
  • Culture and customs of Iran and Afghanistan
  • Syntactic and semantic linguistics in the Indo-European language family
  • Basic knowledge of folklore taxonomy
My study

Supervisor: Shin Takehara

The main theme of my study was "Linguistic features of Persian for non-native speakers". To learn a language efficiently, it is necessary to understand how it differs linguistically from one's native language. My study focused on the case of a native Japanese speaker learning Persian. To investigate the linguistic differences between Japanese and Persian, I used methods from computational linguistics: I constructed syntactic trees from Japanese-Persian parallel text and compared which elements are added or deleted between the two languages.
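The comparison of added and deleted elements can be sketched roughly as follows. This is only an illustration, not the code used in the study: the `Node` class, the label names and the two toy trees are assumptions made for the example, and a real pipeline would obtain the trees from a parser.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A node of a syntactic tree; `label` is a category such as a POS tag."""
    label: str
    children: List["Node"] = field(default_factory=list)


def label_counts(tree: Node) -> Counter:
    """Count how often each label occurs anywhere in the tree."""
    counts = Counter([tree.label])
    for child in tree.children:
        counts.update(label_counts(child))
    return counts


def diff_elements(source: Node, target: Node) -> Tuple[Counter, Counter]:
    """Return (added, deleted) label counts when going from source to target."""
    src, tgt = label_counts(source), label_counts(target)
    added = tgt - src      # labels that occur more often in the target tree
    deleted = src - tgt    # labels that occur more often in the source tree
    return added, deleted


# Toy trees for one parallel sentence pair (labels are illustrative only).
japanese = Node("S", [Node("NP", [Node("NOUN"), Node("PARTICLE")]), Node("VERB")])
persian = Node("S", [Node("PREP"), Node("NP", [Node("NOUN")]), Node("VERB")])

added, deleted = diff_elements(japanese, persian)
print(dict(added), dict(deleted))
```

Counting labels ignores where in the tree an element sits; an alignment-based comparison would be needed to say which specific node was added or deleted.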

Other activities

Nara Institute of Science and Technology, Graduate School of Information Science

Major: Natural language processing
Degree: M.E.
1 April 2012 ~ 31 March 2014
Nara, Japan
Knowledge and skills
  • Basic mathematics for machine learning
  • Technologies for Natural language processing, including machine learning
  • Programming skills (Python, Scala)
My study

Supervisor: Yuji Matsumoto

Narrative analysis is an established practice in folklore studies. It lets us discover the customs and manners of the communities in which narratives are told. Thompson's Motif-Index (TMI) is widely used as a label set for this analysis: it expresses narratives in a taxonomic way, which makes them efficient to analyze. However, new labeled narrative corpora are rarely constructed, because manual labeling is time-consuming. To address this, several label-inference methods based on supervised learning have been proposed. These methods are difficult to apply to existing unlabeled narratives because of a language gap: folktales and myths are usually written in the language of the community that tells them, so the labeled corpus and the unlabeled corpus are frequently in different languages. This study aimed to bridge that language gap.

Keyword: Folklore, Document classification, Multi-label classification, Natural language processing, Feature selection

Other activities

Work experience

Drecom Inc.

Position: Data scientist
1 April 2014 ~ 31 March 2015
Tokyo, Japan
Enterprise information
  • Domain of business: Information technology
  • Number of employees: 407 as of 31 March 2018
  • Main contents of business: Video game provider for smartphone devices, provider of an online advertising platform
My Mission

To provide good-quality games, it is necessary to understand users' behavior. Social game providers record a huge amount of user-behavior data, such as login timestamps, transitions from one action to another, and XY coordinates of screen touches during a game. Such data tend to be massive, so they cannot be handled with commonly used software such as Excel. In particular, I worked on the following projects:

  • Construction of an auto-reporting system to monitor users' behavior
  • Factor analysis of what keeps users playing a video game over the long term
  • Construction of an internal tool to monitor competitors' bidding in online advertising

Insight Tech Ltd.

Position: Data scientist
14 February 2015 ~ ongoing
Tokyo, Japan
Enterprise information
  • Domain of business: Information technology
  • Main contents of business: Providing a text analytics system for the Japanese language, running an opinion platform named "Fuman Kaitori". For detailed information on Insight Tech, please see this article.
My Mission
  • Development of a text analytics model and its deployment as a web application named "ITAS". Press release (JP only)
  • Administration of the internal servers used to develop machine-learning models and systems
  • Text mining to find business-relevant trends and insights in the massive text posted to "Fuman Kaitori"
  • Hiring and mentoring intern students
  • Building partnerships with institutes and universities to follow the latest research in computer science

Projects

Project timeline

Osaka University

Research assistant for constructing a folklore database

April 2010~March 2011

A research project was conducted to construct a folklore database based on narratives collected in Iran.

While entering the data, I wrote scripts to make this work more efficient.

  • Keyword: XML, Folklore, Visual Basic

Nara Institute of Science and Technology

Translation from Japanese text into Japanese sign language

June 2012~March 2013

We formed a research project team to translate Japanese text into Japanese sign language rendered with computer graphics.

My mission was to translate Japanese text into symbols that express meaning for sign-language users. Sign language has special notation symbols for recording the meaning of sentences, because signers use hand gestures and facial movements. Once text is translated into these symbols, a computer-graphics video can be generated from them. To achieve this, I used a rule-based approach built on syntactic trees and predicate-argument analysis. We reported this work in the following paper (JP only).

This project was supported by a research funding program named "NAIST Creative and International Competitiveness Project (CICP2012)"

  • Project member: 4
  • My role: Project leader
  • Keyword: Predicate-argument analysis, Dependency parser

Automatic error correction of English text

February 2013~June 2013

We formed a research project team for the CoNLL-2013 Shared Task, which aims to build systems that correct English grammatical errors.

I worked on building a model to correct "noun number" errors, that is, using the singular form where a plural noun is required, or vice versa.

We reported this work in a paper.

  • Project member: 8
  • Keyword: Grammatical error correction, Binary classification, Stanford parser
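
Framing noun-number correction as binary classification can be sketched as follows. This is a minimal illustration under stated assumptions, not the model from the shared-task paper: `extract_features`, `toggle_number`, `correct_noun` and `toy_classifier` are hypothetical names, the features are deliberately simple, and the stand-in classifier is a hand-written rule where a trained model would normally be used.

```python
from typing import Callable, Dict, List

Features = Dict[str, str]


def extract_features(tokens: List[str], index: int) -> Features:
    """Collect simple context features around the noun at `index`."""
    prev_word = tokens[index - 1].lower() if index > 0 else "<s>"
    next_word = tokens[index + 1].lower() if index + 1 < len(tokens) else "</s>"
    return {"prev": prev_word, "next": next_word}


def toggle_number(noun: str) -> str:
    """Naive singular/plural toggle for regular nouns (illustration only)."""
    return noun[:-1] if noun.endswith("s") else noun + "s"


def correct_noun(tokens: List[str], index: int,
                 is_plural: Callable[[Features], bool]) -> List[str]:
    """Flip the noun's number when the classifier disagrees with the surface form."""
    noun = tokens[index]
    observed_plural = noun.endswith("s")
    predicted_plural = is_plural(extract_features(tokens, index))
    corrected = list(tokens)
    if predicted_plural != observed_plural:
        corrected[index] = toggle_number(noun)
    return corrected


def toy_classifier(features: Features) -> bool:
    """Stand-in for a trained binary classifier: plural after these determiners."""
    return features["prev"] in {"many", "several", "two", "these"}


sentence = ["She", "read", "many", "book", "yesterday"]
print(correct_noun(sentence, 3, toy_classifier))
```

The classifier only answers "singular or plural?" for each noun; the correction step then compares that answer with the form actually written.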

Drecom Inc.

Factor analysis of what keeps users using a video game

April 2014~June 2014

Drecom Inc. provides an English-learning application for native Japanese speakers named "Eipontan". I investigated the factors that keep users using the application and learning.

  • Project member: 3
  • My role: Lead investigator
  • Keyword: MySQL, Hadoop, R (on RStudio)

Development of internal web application

June 2014~August 2014

I developed an internal web application that stores and displays competitors' bidding history on an online advertising platform.

Before this web application, the team recorded past bidding history in Excel, which produced a huge number of Excel files. They switched to this web application.

  • Project member: 3
  • My role: Design and implementation of the web application
  • Keyword: R, RStudio, Shiny

Investigation of users' behavior & development of report mail system

August 2014~January 2015

Drecom provides a browser-based video game named "Onmyo-ji". I investigated how specific in-game events affect users' behavior.

  • Project member: 4
  • My role: Lead investigator
  • Keyword: MySQL, Hadoop, R, Rmd report

Insight Tech Ltd.

Development of a batch aggregation system

February 2015~May 2015

Development of a business dashboard to visualize data from the opinion platform "Fuman Kaitori".

  • Project member: 5
  • My role: I wrote the dashboard wireframe and built the batch aggregation system
  • Keyword: Wireframe, Python, Jenkins

Partnership with Kurohashi & Kawahara Lab at Kyoto University

May 2015

I established the partnership in order to build a text analytics system with the latest syntactic parsing technology for Japanese.

Administrator of internal server

June 2015

I introduced internal servers (4 UNIX servers, including one GPU server).

I set up environments for machine-learning modeling, for batch jobs run by a continuous integration tool, and for a distributed computing system.

  • Keyword: Unix, Ubuntu, Jenkins, Spark

Text clustering system

June 2015 ~ November 2015

I developed a system that groups documents with a clustering algorithm.

I investigated several clustering algorithms with parameter tuning.

I published the results of this investigation in a paper (JP only).

  • Project member: 4
  • My role: Finding the best clustering algorithm for the given text
  • Keyword: LDA, k-means, syntactic parsing, Bayesian parameter tuning

Published a dataset for NLP research

October 2015

I proposed publishing a dataset (corpus) for NLP research. The corpus consists of text posted to the opinion platform "Fuman Kaitori".

The dataset is now available from NII in Japan, to research institutes only. Link (JP only)

I published a paper that describes the specification of the dataset.

Development of a system for information retrieval

February 2016~July 2016

A project to find "useful opinions" and "useful keywords" in plain text.

I developed a system that finds named entities with Wikification, the task of finding keywords that refer to Wikipedia articles.

After some investigation, I used an LSTM for the disambiguation step of the Wikification task: the input is a sequence of candidate Wikipedia articles and the output is the most plausible sequence.

  • Project member: 3
  • My role: System development to run Wikification on plain text
  • Keyword: Python, Redis, Wikification, LSTM, word2vec

A model to extract "negative phrase" (Intern mentor)

August 2016~September 2016

I mentored an intern student in developing a model.

The model extracts "negative phrase" from plain text. The output of this system is useful for sentiment classification tasks.

  • Project member: 2
  • My role: Mentor of the intern student
  • Keyword: Mentor, Python, language model

Putting an information retrieval system into production

October 2016~December 2016

I refactored the source code in this project.

The system's algorithm has many system dependencies, which made it difficult to run stably in production.

I therefore introduced Docker to package the system environment.

  • Project member: 1
  • My role: Refactoring of the source code, system design for stable operation
  • Keyword: Python, Docker

Published an NLP resource

January 2017

I proposed a plan to publish an NLP resource, which was expected to raise recognition of Insight Tech's business.

The resource is now available at NII. Link(JP only)

  • Project member: 2
  • My role: Planning, drafting the contract and legal policy for the resource, creating the resource
  • Keyword: NLP resource

Internal web application for information retrieval model

January 2017~February 2017

I developed a web application to call the model.

With the web application, everyone at Insight Tech can use this model. Input and output are Excel files.

  • Project member: 1
  • My role: Refactoring of source code, development of web application
  • Keyword: Web application, Python, Django, Apache

Clustering of opinion tuple

February 2017~April 2017

I developed a model to cluster a large number of "opinion tuples". An opinion tuple is the output of this system and consists of (target-of-predicate, relation, predicate).
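
As a data structure, such a tuple could be represented as below. This is a minimal sketch: the class name `OpinionTuple`, the field names and the example values are assumptions for illustration, and the source does not specify how the actual system stores its tuples.

```python
from typing import NamedTuple


class OpinionTuple(NamedTuple):
    """One extracted opinion: (target-of-predicate, relation, predicate)."""
    target: str      # what the opinion is about
    relation: str    # how the target relates to the predicate
    predicate: str   # what is said about the target

    def as_text(self) -> str:
        """Join the three parts into one string, e.g. as input to an encoder."""
        return " ".join(self)


# Hypothetical example values.
example = OpinionTuple(target="delivery", relation="is", predicate="slow")
print(example.as_text())
```

Flattening the tuple to a token sequence is one simple way to feed it to a sequence model before clustering.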

I ran experiments to find the best model for converting opinion tuples into dense vectors. In the end, I adopted an LSTM auto-encoder.

I also developed a visualization of the clustering output. Here is one example of visualization output.

  • Project member: 1
  • My role: System design, investigation/selection of algorithms, experiments to select best algorithm, implementation of the algorithm
  • Keyword: Python, LSTM, CNN, word2vec, k-means, HDBSCAN, JavaScript, D3.js

Internal web application for information retrieval model

April 2017~October 2017

I developed an internal web application that calls the information retrieval model and the clustering model.

This web application provides NLP solutions and visualizations based on them, such as word clouds and clustering trees. It also produces an Excel file that contains the results of the NLP processes and is easy for non-NLP professionals to handle.

The NLP processes usually take a long time, so it is difficult to keep the application responsive. I therefore adopted a job-queue style for the NLP processes.

  • Project member: 1
  • My role: System design, making wireframe, implementation of Web application
  • Keyword: Python, Django, PostgreSQL, JavaScript, D3.js, jQuery, Bootstrap, Apache, Docker
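
The job-queue style mentioned above can be sketched with the Python standard library. This is an illustration of the general pattern, not the production design: `submit`, `worker` and `slow_nlp_process` are hypothetical names, the in-memory `results` dictionary stands in for a real database, and the actual application may well have used a dedicated queue library.

```python
import queue
import threading
import uuid

jobs: "queue.Queue" = queue.Queue()
results = {}                      # job_id -> status record (stand-in for a DB table)
results_lock = threading.Lock()


def slow_nlp_process(text: str) -> dict:
    """Placeholder for a long-running NLP step."""
    return {"n_tokens": len(text.split())}


def submit(text: str) -> str:
    """Register a job and return immediately, so the web request stays fast."""
    job_id = uuid.uuid4().hex
    with results_lock:
        results[job_id] = {"status": "queued"}
    jobs.put((job_id, text))
    return job_id


def worker() -> None:
    """Background loop that takes jobs off the queue one at a time."""
    while True:
        job_id, text = jobs.get()
        try:
            output = slow_nlp_process(text)
            record = {"status": "done", "output": output}
        except Exception as exc:
            record = {"status": "failed", "error": str(exc)}
        with results_lock:
            results[job_id] = record
        jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

job_id = submit("the delivery was late again")
jobs.join()                       # a web client would poll the status instead
print(results[job_id])
```

The request handler only enqueues work and hands back a job ID; the client later asks for the status of that ID, which keeps response times short even when the NLP step is slow.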

Sentiment classification system / burst-topic detection system (Intern mentor)

August 2017~October 2017

I hired 2 intern students. I designed the systems and planned the development work for their internships.

One intern student developed a "clause-based" sentiment classification model instead of a "document-based" one. This approach is useful when documents are long. We reported this project in a paper.

Another intern student developed a model to detect burst topics on the opinion platform "Fuman Kaitori". A "burst" is a phenomenon on social networking services (such as Twitter) in which a topic suddenly appears.
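
One simple way to make the idea of a burst concrete is to flag time steps whose mention count is far above the recent average. The sketch below is an assumption for illustration only; the source does not describe the method the intern's model used, and `detect_bursts`, its parameters and the sample counts are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def detect_bursts(counts: List[int], window: int = 5,
                  threshold: float = 3.0) -> List[int]:
    """Return indices whose count exceeds the trailing mean by `threshold` deviations."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        spread = pstdev(history) or 1.0   # avoid division by zero on flat history
        if (counts[i] - baseline) / spread > threshold:
            bursts.append(i)
    return bursts


# Hypothetical daily mention counts for one topic.
daily_mentions = [2, 3, 2, 4, 3, 2, 30, 28, 3, 2]
print(detect_bursts(daily_mentions))
```

With this rule only the first day of a spike is flagged, because the spike itself inflates the baseline for the following days.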

As a mentor, I helped them understand the background of the tasks and write good-quality source code.

  • Keyword: Mentor, Python, sentiment classification, CRF, burst detection

An internal web application of the sentiment classification model

November 2017~December 2017

I refactored the source code of the sentiment classification model that an intern student had developed, and put it into an internal web application.

I implemented the web application as a REST API because another web application calls it.

  • Project member: 1
  • My role: System design, making wireframe, implementation of Web application
  • Keyword: Python, Flask, PostgreSQL, Docker

Document similarity model (Intern mentor)

January 2018~February 2018

As part of a training program with a closely affiliated university, I helped a B4 (fourth-year undergraduate) student develop a model to compute document similarity.

Migration of an internal system into cloud computing

April 2018~July 2018

I moved the NLP analytics systems, which had been running on an internal server, to a cloud computing environment (AWS).

It had become difficult to scale these systems on the internal servers.

These NLP analytics systems tend to have high computing costs and long execution times, so I looked for the best way to run them in a reasonable way with AWS components.

In the end, I achieved this with AWS Batch and its associated components.

  • Project member: 1
  • My role: System design, implementation
  • Keyword: Python, Docker, API Gateway, AWS Lambda, AWS Batch, S3

Sentiment classification system / Search engine with linguistics aspect (Intern mentor)

August 2018~September 2018

I hired 2 intern students (master's candidates in computer science).

One student worked on improving the previous sentiment classification model with a deep neural network.

Another student developed a search engine that can search text by its "modality". Modality covers the various linguistic expressions that depend on the speaker's attitude, such as "demand", "prediction" or "prohibition".

I helped them understand the task background, the algorithms of deep neural networks, and the relevant linguistic knowledge.

  • Keyword: Mentor, Python, sentiment classification, LSTM, CNN, modality