Social media data sensitivity and privacy scanning an experimental analysis with hadoop

Authors :
Ashish Lokhande
Sanjay Bansal
Source :
2017 International Conference on Information, Communication, Instrumentation and Control (ICICIC).
Publication Year :
2017
Publisher :
IEEE, 2017.

Abstract

Social networking has become a daily habit, and young people and teenagers in particular spend much of their time on social media. Because users are so readily reachable, marketing companies also use these platforms to publish advertisements. Not all users are legitimate, however; the platforms are sometimes used to abuse or harass others. It is therefore necessary to identify sensitive content on social media platforms before it is published. A number of approaches are available for scanning content, but they are time-consuming and so cannot be applied directly to social networks. This work presents an effort toward a more efficient technique. The proposed technique enhances the traditional fingerprint scan method for sensitive content evaluation. It incorporates NLP (natural language processing) parsers to identify the sensitive features, which are taken here to be the nouns in a tweet, because in most cases the identities of persons or places are used to mislead social network users. Additionally, in place of linear search, a random index scan method is introduced to reduce the time consumption of the traditional approaches; even in the worst case, this technique produces results equal to those of linear search. The proposed technique is evaluated on Twitter data using an implementation based on Hadoop, Storm, and the Twitter API. After successful implementation, the technique is compared with the traditional technique in terms of time and space complexity. The experimental results show that, in terms of time requirement, the proposed technique is three times more efficient than the traditional sensitivity scan approach.
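
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, under stated assumptions: nouns are extracted from a tweet, fingerprinted, and looked up in an index of sensitive fingerprints by probing positions in random order rather than front to back. The SHA-1 fingerprint, the capitalized-word noun heuristic (a stand-in for a real NLP parser), and all function names are hypothetical and are not taken from the paper.

```python
from __future__ import annotations

import hashlib
import random


def fingerprint(term: str) -> str:
    """Hash a normalized term. SHA-1 is an assumption, not the paper's choice."""
    return hashlib.sha1(term.lower().encode("utf-8")).hexdigest()


def candidate_nouns(tweet: str) -> list[str]:
    """Hypothetical stand-in for an NLP parser: treat capitalized words as nouns."""
    words = (w.strip(".,!?:;\"'()") for w in tweet.split())
    return [w for w in words if w[:1].isupper()]


def random_index_scan(index: list[str], target: str,
                      rng: random.Random) -> tuple[bool, int]:
    """Probe index positions in random order; return (found, probes used).

    Every position is visited at most once, so a miss costs len(index)
    probes, the same as a linear search in its worst case.
    """
    order = list(range(len(index)))
    rng.shuffle(order)
    for probes, pos in enumerate(order, start=1):
        if index[pos] == target:
            return True, probes
    return False, len(index)


def is_sensitive(tweet: str, index: list[str], rng: random.Random) -> bool:
    """Flag a tweet if any candidate noun's fingerprint is in the index."""
    return any(
        random_index_scan(index, fingerprint(noun), rng)[0]
        for noun in candidate_nouns(tweet)
    )


# Illustrative data only; these terms are not from the paper's dataset.
sensitive_index = [fingerprint(t) for t in ["alice", "springfield", "bob"]]
rng = random.Random(0)
flagged = is_sensitive("Meet Alice at noon", sensitive_index, rng)
```

In a deployment like the one the abstract mentions, a check of this kind would presumably run inside the stream-processing layer on each incoming tweet; that wiring is omitted here.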

Details

Database :
OpenAIRE
Journal :
2017 International Conference on Information, Communication, Instrumentation and Control (ICICIC)
Accession number :
edsair.doi...........15ae9356c825c5991976367b847be544