Search Results (246 results)
2. Bidirectional IndRNN malicious webpages detection algorithm based on convolutional neural network and attention mechanism.
- Author
- Wang, Huan-Huan, Tian, Sheng-Wei, Yu, Long, Wang, Xian-Xian, Qi, Qing-Shan, and Chen, Ji-Hong
- Subjects
- ARTIFICIAL neural networks, UNIFORM Resource Locators, RECURRENT neural networks, WEBSITES, ALGORITHMS
- Abstract
A parallel joint model (CATTB) combining a convolutional neural network, an attention mechanism, and a bidirectional independently recurrent neural network is proposed. The algorithm extracts the relocation feature and the "texture fingerprint" feature, which express the similarity of the binary file content of malicious web pages' URLs (Uniform Resource Locators), and uses the word-vector tool word2vec to train URL word-vector features and extract static URL vocabulary features. A CNN (Convolutional Neural Network) extracts deep local features; an attention mechanism then adjusts the weights, and a BiIndRNN (Bidirectional Independently Recurrent Neural Network) extracts global features; finally, softmax performs the classification. The paper thus extracts more comprehensive features from different angles using different methods. The experimental results exceed those reported by other researchers, and compared with other algorithms, the proposed CATTB algorithm improves the accuracy of malicious web page detection. [ABSTRACT FROM AUTHOR]
- Published
- 2020
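To make the described pipeline concrete, here is a minimal, hypothetical Keras sketch of a CATTB-style model: character-level URL input, a CNN for deep local features, an attention-weighted bidirectional recurrence for global features, and a softmax classifier. Keras ships no IndRNN layer, so a GRU stands in for the paper's BiIndRNN; all sizes are assumptions.

```python
# Hypothetical sketch only; a GRU substitutes for the paper's BiIndRNN.
import tensorflow as tf
from tensorflow.keras import layers, models

MAX_LEN, VOCAB = 200, 128          # assumed URL length / character vocabulary

inp = layers.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB, 32)(inp)                 # word2vec-style embeddings
local = layers.Conv1D(64, 5, activation="relu")(x)   # CNN: deep local features
seq = layers.Bidirectional(layers.GRU(32, return_sequences=True))(local)
att = layers.Attention()([seq, seq])                 # self-attention reweighting
feat = layers.GlobalMaxPooling1D()(att)              # pooled global feature vector
out = layers.Dense(2, activation="softmax")(feat)    # benign vs. malicious
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```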
3. DDIML: Explainable detection model for drive-by-download attacks.
- Author
- Liu, Xiaole, Huang, Cheng, and Fang, Yong
- Subjects
- FEATURE extraction, DEEP learning, MACHINE learning, RANDOM forest algorithms, MALWARE, WEBSITES
- Abstract
A drive-by download is a method by which hackers plant web Trojans that exploit browser vulnerabilities to execute malicious software. Because people access web pages with various browsers daily, drive-by downloads have become one of the most common threats in recent years. Most previous studies utilize the abstract syntax tree (AST) with deep learning methods to detect such attacks, which achieves high accuracy but is time-consuming and challenging to explain. Other methods use dynamic analysis, which needs a specific environment and is time-consuming and complex to operate. To solve these problems, this paper proposes DDIML, an explainable machine learning model based on novel features obtained by static analysis. These features are extracted from five aspects: code obfuscation, URL redirection, special behaviors, encoding characters, and CSS attributes. Random forest, one of the most popular machine learning algorithms, is applied to build the classifier. In addition, we use both local and global explanations to improve the model and to show that the proposed model can be trusted. The experimental results show that our proposed model can efficiently detect drive-by downloads with a detection precision of 0.983 and a recall of 0.980. The average detection time for each sample is only 16.07 ms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
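As a rough illustration of this kind of static, explainable pipeline (not the authors' code), the sketch below trains a random forest on a precomputed feature matrix and reads off feature importances as a crude global explanation; the .npy file names are hypothetical.

```python
# Illustrative sketch, assuming features/labels were already extracted.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

X, y = np.load("features.npy"), np.load("labels.npy")   # hypothetical files
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("precision", precision_score(y_te, pred), "recall", recall_score(y_te, pred))
print("global explanation:", clf.feature_importances_)  # per-feature weight
```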
4. Research on data retrieval and analysis system based on Baidu reptile technology in big data era.
- Author
- Jin, Jiangang, Elhoseny, Mohamed, and Yuan, X.
- Subjects
- INFORMATION retrieval, SYSTEM analysis, BIG data, WEBSITES, DATA analysis, SCALABILITY, INFORMATION networks, TEMPORAL databases
- Abstract
With the rapid development of the Internet, the Web has become the main platform for people to publish and retrieve information. Quickly and accurately finding the information users require within a large volume of network information resources has become an urgent need. Web crawling is the research field that emerged to meet this demand. On this basis, the paper designs and implements a distributed web crawler system building on existing research; its goal is to provide high-quality data support for a network public-opinion system. The system solves the problems of low efficiency, poor scalability, and low automation that affect single-machine crawlers, improving the speed of webpage collection and the precision of data extraction while expanding the scale of webpage collection. At the end of the article, screenshots of the system interface and test results are presented. The test results show that the crawler system can effectively collect dynamic web pages and that automatic extraction of web pages achieves high precision, realizing the entire crawling pipeline. [ABSTRACT FROM AUTHOR]
- Published
- 2020
5. Trends in web data extraction using machine learning.
- Author
- Patnaik, Sudhir Kumar and Narendra Babu, C.
- Subjects
- DATA extraction, MACHINE learning, WEBSITES, ELECTRONIC data processing, INTELLIGENT buildings
- Abstract
Web data extraction has seen significant development since its inception in the early nineties. It has evolved from simple manual extraction of data from web pages and documents, to automated extraction, to intelligent extraction using machine learning algorithms, tools, and techniques. Extraction is one of the key components of the end-to-end web data extraction life cycle, which includes navigation, extraction, data enrichment, and visualization. This paper traces the journey of web data extraction over the years, highlighting the evolution of tools, techniques, frameworks, and algorithms for building intelligent web data extraction systems. The paper also sheds light on challenges, opportunities for future research, and emerging trends, with a specific focus on machine learning techniques. Both traditional and machine learning approaches to manual and automated web data extraction are evaluated experimentally, with use cases demonstrating the challenges posed by changes in website layout. The paper introduces novel ideas such as self-healing capability in web data extraction and proactive error detection upon changes in website layout as areas of future research. This perspective will help readers gain deeper insights into the present and future of web data extraction. [ABSTRACT FROM AUTHOR]
- Published
- 2021
6. Software reusability metrics prediction by using evolutionary algorithms: The interactive mobile learning application RozGaar.
- Author
- Padhy, Neelamadhab, Satapathy, Suresh Chandra, Mohanty, J.R., and Panigrahi, Rasmita
- Subjects
- COMPUTER software, EVOLUTIONARY algorithms, WEBSITES, MACHINE learning
- Abstract
Considering object-oriented software metrics (cohesion, coupling and complexity) and their significance in characterizing software quality, particularly software component reusability, we have considered six important CK metrics. The predominant reason for using these measurements is their individual relationship with design aspects and fault-proneness or aging-proneness. A key objective of this paper is to generate employment openings for thousands of people with different skillsets, and furthermore to provide hassle-free services by RozGaar service providers to customers with the help of machine learning techniques. Amid the current century's rapid modernization and automation, manual labor is reduced, which gives rise to mass unemployment. If we need technicians, workers, plumbers or drivers who work on daily wages, it is quite difficult to find one in our locality without contact references or knowledge of the quality of their work. This paper helps fill the gap between customers and service providers. We aim to present an ocean of opportunities where people can get jobs on a daily basis and earn money for their skills. The application is dual-platform, running on Android devices and on the Internet as a website, and promises unmatched services for daily work. To achieve the goal, we used a novel software prediction model with evolutionary algorithms such as decision tree, Rough Set, and Logistic Regression to predict software reusability. [ABSTRACT FROM AUTHOR]
- Published
- 2018
7. Trust-based recommendations for documents.
- Author
- Hess, Claudia and Schlieder, Christoph
- Subjects
- SOCIAL networks, RECOMMENDER systems, WEB personalization, ELECTRONIC records, ONLINE social networks, WEBSITES
- Abstract
Recommendation techniques that analyze social trust networks have attracted much attention in the last few years. They recommend items that are appreciated by trusted friends. In this paper, we explore how to use trust information for generating personalized document recommendations, such as for scientific papers or webpages. The basic idea is to jointly analyze the trust network between readers who review the documents and the reference network between the documents. We develop trust-enhanced visibility measures for measuring the quality and importance of documents and evaluate them in simulation studies. [ABSTRACT FROM AUTHOR]
- Published
- 2008
8. Gold-level open access at the Semantic Web journal.
- Author
- Janowicz, Krzysztof and Hitzler, Pascal
- Subjects
- SEMANTIC Web, BIBLIOGRAPHY, PERIODICAL subscriptions, WEBSITES
- Published
- 2020
9. Translation of news reports related to COVID-19 of Japanese Linguistics based on page link mining.
- Author
- Liu, Xiaohua and Li, Xiaolong
- Subjects
- COVID-19, INTERNET content, WEBSITES, K-means clustering, ALGORITHMS, HYPERLINKS
- Abstract
In the face of the current epidemic situation, news reporting faces the problem of ensuring accuracy. The speed and accuracy of public-emergency news depend on the accuracy of web page link and tag clustering. An improved web page clustering method based on the combination of topic clustering and structure clustering is proposed in this paper. The algorithm takes the result of web page structure clustering as a weight factor; combined with web content clustering by the K-means algorithm, the basic content that meets the conditions is selected. Through the improved clustering-based translator, this content is translated into Chinese and compared with the target content to analyze similarity. This achieves the aim of translating Japanese-language news reports related to the novel coronavirus (COVID-19) epidemic based on page link mining. [ABSTRACT FROM AUTHOR]
- Published
- 2020
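A minimal sketch of the content-clustering half of such a method: K-means over TF-IDF vectors of page text, with a per-page weight factor standing in for the structure-clustering result. The weighting scheme and sample data below are assumptions, not the paper's exact formulation.

```python
# Content clustering with a stubbed structure-derived weight factor.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

pages = ["epidemic news report text ...", "sports column text ..."]  # crawled pages
struct_weight = np.ones(len(pages))            # stand-in structure-cluster factor

X = TfidfVectorizer(max_features=5000).fit_transform(pages)
X = X.multiply(struct_weight[:, None]).tocsr() # apply per-page weight factor
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)                                  # cluster id per page
```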
10. Application of deep learning and BP neural network sorting algorithm in financial news network communication.
- Author
- Jingyu, Chen, Qing, Chen, Isaeva, Ekaterina, and Rocha, Álvaro
- Subjects
- TELECOMMUNICATION systems, DEEP learning, WEBSITES, ALGORITHMS, SEARCH engines, ELECTRONIC data processing, HABIT breaking
- Abstract
The reasonable ranking of web pages is an important step in realizing search engine technology and plays a key role in improving the quality of retrieved and presented information. In this paper, the authors analyze the application of deep learning and a BP neural network sorting algorithm in financial news network communication. From users' historical search and browsing data, we can extract the latent connections between page content and users' behavioral habits, so as to mine pages users may prefer. Finally, we use a variety of technologies to combine recommendation and sorting. Observation over multiple training runs shows that the network model converges quickly, the training time is short, and the training effect is good. We need to fully understand the characteristics of financial news communication in the new media era, fully grasp people's financial news reading habits on this basis, and then put forward an innovative mode of financial news communication in the new media era based on these two points. [ABSTRACT FROM AUTHOR]
- Published
- 2020
11. Higher-Order Rank Functions on Directed Graphs.
- Author
- Kashiwabara, Kenji, Horie, Ikumi, and Yamaguchi, Kazunori
- Subjects
- DIRECTED graphs, WEBSITES
- Abstract
We introduce a new higher-order rank function with the capability to completely discriminate non-equivalent nodes. We review the partition lattice and rank functions and situate the existing rank functions and higher-order rank functions within this formalization. We propose a new refining operator and a new rank function that are better than the existing ones in some applications. We also show that the entire topology (graph) can be reconstructed from our higher-order ranks alone, making it possible to compare nodes in different graphs and to update the equivalence of nodes when an edge is added. Finally, we briefly describe the use of our higher-order rank function in analyzing web pages as a possible application in different domains. [ABSTRACT FROM AUTHOR]
- Published
- 2020
12. Web text data mining method based on Bayesian network with fuzzy algorithms.
- Author
- Zhao, Wei, Luo, Zeju, and Zhang, Weiping
- Subjects
- DATA mining, FUZZY mathematics, FUZZY measure theory, WEBSITES, MEMBERSHIP functions (Fuzzy logic), FUZZY arithmetic
- Abstract
With the advent of the Web 3.0 era, the number and complexity of Web pages in Bayesian networks have grown explosively, accompanied by geometric growth of the information contained in Web pages. Web text data in Bayesian networks usually hide rich knowledge and rules of value to users. However, due to the semi-structured, real-time and discrete characteristics of Web text data, it is difficult for users to obtain the knowledge they need directly from such complex data sets. Fuzzy mathematics provides a good research approach for solving such problems, as its ideas can be used to analyze practical problems in text data. Therefore, how to effectively mine the Web text information and knowledge that users really care about from a Bayesian network, and present it in a way users can understand, is currently a very popular research topic. In this paper, we select microblog data as the Bayesian-network text for experiments. A user data model of the microblog is established using relevant fuzzy theory. The concept of a fuzzy measure is introduced to calculate the non-additive measure value under the interaction relationships between the detection indicators. We determine the membership-function relationship between the detected user and the text data, calculate the Choquet, Sugeno and Wang integrals of the membership function under the non-additive measure, and judge the final valuable Web text data by integral value. On this basis, the results of Web text mining technology and fuzzy mathematics are combined to design and implement information acquisition and analysis for a Bayesian network community. The recall rate obtained by the experimental method in this paper is as low as 4% and tends to be stable. [ABSTRACT FROM AUTHOR]
- Published
- 2020
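For the two standard fuzzy integrals named above, here is a small worked sketch for the discrete case; the non-additive measure mu and the membership values f are invented examples (the Wang integral is omitted).

```python
# Discrete Choquet and Sugeno integrals under a non-additive measure.
def choquet(f, mu):
    items = sorted(f, key=f.get)                  # criteria by ascending f
    total, prev = 0.0, 0.0
    for i, c in enumerate(items):
        coalition = frozenset(items[i:])          # criteria with f >= f(c)
        total += (f[c] - prev) * mu[coalition]
        prev = f[c]
    return total

def sugeno(f, mu):
    items = sorted(f, key=f.get)                  # max over min(f, mu(upper set))
    return max(min(f[c], mu[frozenset(items[i:])]) for i, c in enumerate(items))

f = {"a": 0.2, "b": 0.7}                          # membership values per criterion
mu = {frozenset("ab"): 1.0, frozenset("a"): 0.5,  # non-additive measure on subsets
      frozenset("b"): 0.6, frozenset(): 0.0}
print(choquet(f, mu), sugeno(f, mu))              # 0.5 and 0.6
```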
13. Comparison of Classification Algorithms for Detection of Phishing Websites.
- Author
- VAITKEVICIUS, Paulius and MARCINKEVICIUS, Virginijus
- Subjects
- CLASSIFICATION algorithms, UNIFORM Resource Locators, SUPERVISED learning, PHISHING, COMPUTER crimes, WEBSITES, ALGORITHMS
- Abstract
Phishing activities remain a persistent security threat, with global losses exceeding 2.7 billion USD in 2018, according to the FBI's Internet Crime Complaint Center. In the literature, different generations of phishing website detection methods have been observed. The oldest methods include manual blacklisting of known phishing websites' URLs in a centralized database, but they have not been able to detect newly launched phishing websites. More recent studies have attempted to solve phishing website detection as a supervised machine learning problem on phishing datasets, designed on features extracted from phishing websites' URLs. These studies have shown some classification algorithms performing better than others on differently designed datasets, but have not distinguished the best classification algorithm for the phishing website detection problem in general. The purpose of this research is to compare classic supervised machine learning algorithms on all publicly available phishing datasets with predefined features and to distinguish the best performing algorithm for solving the problem of phishing website detection, regardless of a specific dataset design. Eight widely used classification algorithms were configured in Python using the Scikit-Learn library and tested for classification accuracy on all publicly available phishing datasets. The algorithms were then ranked by accuracy on the different datasets using three ranking techniques, while testing the results for statistically significant differences using Welch's T-Test. The comparison results are presented in this paper, showing ensembles and neural networks outperforming the other classical algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2020
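A condensed sketch of the described protocol: cross-validated accuracy for several scikit-learn classifiers on one dataset, with Welch's T-Test between the two best performers. The dataset files and the model shortlist are assumptions; the study itself covers eight algorithms and multiple datasets.

```python
# Compare classifiers by CV accuracy, then Welch's T-Test on the top two.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = np.load("phishing_X.npy"), np.load("phishing_y.npy")   # hypothetical files
models = {"rf": RandomForestClassifier(), "mlp": MLPClassifier(), "svm": SVC()}
scores = {n: cross_val_score(m, X, y, cv=10) for n, m in models.items()}

best, second = sorted(scores, key=lambda n: scores[n].mean(), reverse=True)[:2]
t, p = ttest_ind(scores[best], scores[second], equal_var=False)  # Welch's test
print(best, "vs", second, "p =", p)
```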
14. A New Method of Multi-Criteria Analysis for Evaluation and Decision Making by Dominant Criterion.
- Author
- ŽIŽOVIĆ, Miodrag M., ALBIJANIĆ, Miloljub, JOVANOVIĆ, Verka, and ŽIŽOVIĆ, Mališa
- Subjects
- DECISION making, DENTAL clinics, WEBSITES
- Abstract
This paper introduces a new method for multi-criteria analyses where the failure to meet the dominant criterion of an alternative causes low values for the entire alternative. In this method, the introduction of new alternatives into the multi-criteria model does not affect the existing alternatives in the model. The new method was applied for the rating of ten websites of dental clinics in Serbia, which provide prosthetic services to tourists. The dominant criterion was the amount of information provided by the site. [ABSTRACT FROM AUTHOR]
- Published
- 2019
15. Online learning agents for cost-sensitive topical data acquisition from the web.
- Author
- Naghibi, Mahdi, Anvari, Reza, Forghani, Ali, and Minaei, Behrouz
- Subjects
- ONLINE education, ACQUISITION of data, LEARNING ability, WEBSITES, ELECTRONIC data processing
- Abstract
Access to one of the richest data sources in the world, the web, is not possible without cost, yet this cost is often not taken into account in data acquisition processes. In this paper, we introduce the Learning Agents (LA) method for automatic topical data acquisition from the web with minimum bandwidth usage and the lowest cost. The proposed LA method uses online learning topical crawlers. The online learning capability allows the LA to dynamically adapt to the properties of web pages during the crawling process for the target topic and to learn an effective combination of a set of link scoring criteria for that topic. In this way, the LA resolves a challenge of former approaches, namely how to combine the outputs of different criteria when computing the value of following a link, and increases the efficiency of the crawlers. A version of the LA method is implemented that uses a collection of topical content analyzers for scoring the links. The learning ability in the implemented LA resolves the challenge of the unclear appropriate size of link contexts for pages of different topics. Empirical evaluation using standard metrics indicates that where non-learning methods are inefficient, the learning capability of LA significantly increases the efficiency of topical crawling and achieves state-of-the-art results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
16. Webportal vs google for finding government information on the web: From a website-centric approach to a web ecology perspective.
- Author
- Henman, Paul and Graham, Tim
- Subjects
- GOVERNMENT information, WEBSITES, MUNICIPAL services, INTERNET in public administration
- Abstract
Webportals – websites that operate as front doors or guides into government on the web – are central to government web strategy and presence. However, little is known about their success in enabling people to quickly and accurately access public sector information and services. In these days of Google and generic web search engines, government webportals are not the only way to find government on the web. This paper argues that an effective evaluation of government webportals requires shifting from a website perspective to a whole-of-web (or web ecology) perspective. This perspective is illuminated by an online quasi-experiment on the effectiveness of the British government's webportal, www.gov.uk. Participants' performance in using the webportal to find information about public services was compared with that of participants using commercial web search tools (such as Google). There was mixed evidence that the portal provided greater accuracy in finding public service information, but no evidence of greater speed. The findings suggest that government web strategy should focus less on creating large webportals and more on small, functionally-defined web units that offer enhanced opportunities for commercial search engine discoverability and flexibility for change. [ABSTRACT FROM AUTHOR]
- Published
- 2018
17. Citation behaviour of information science students II: Postgraduate students.
- Author
- Clarke, Maria Elizabeth and Oppenheim, Charles
- Subjects
- ACADEMIC dissertations, WEBSITES, STUDENT attitudes, BIBLIOGRAPHY, BIBLIOGRAPHICAL citations
- Abstract
This paper reports the results of a study of student citation behaviours in the Department of Information Science, Loughborough University. The research methods were citation analysis of student dissertation bibliographies from 1998 to 2003, a survey of students' attitudes to citation behaviour, and a test of student citation accuracy. The results showed the majority of citations were to journals (32.6%), books (30.0%), and websites (24.0%). Websites were found to be cited more than books for the first time in 2002 and again in 2003. The number of citations to electronic formats increased over time. A highly significant correlation was found between the topic students studied and the number of citations to electronic journals. The overall percentage of citation errors was 24.9%, and the majority of bibliographies (80%) were found to contain at least one error. Many students (56.9%) stated they did not feel confident when citing electronic materials. The results suggest that students have many reasons to cite, and that these are generally similar to those of scholarly authors. [ABSTRACT FROM AUTHOR]
- Published
- 2006
18. URLCam: Toolkit for malicious URL analysis and modeling.
- Author
- Ayub, Mohammed, El-Alfy, El-Sayed M., Thampi, Sabu M., and Trajkovic, Ljiljana
- Subjects
- CYBERTERRORISM, FEATURE selection, MACHINE learning, INTERNET security, WEBSITES
- Abstract
World-Wide Web technology has become an indispensable part of human life for almost all activities. At the same time, cyberattacks are on the rise in today's Web-driven world. Effective countermeasures for the analysis and detection of malicious websites are therefore crucial to combat the rising threats to cyber security. In this paper, we systematically review state-of-the-art techniques and identify a total of about 230 features of malicious websites, classified as internal and external features. We also developed a toolkit for the analysis and modeling of malicious websites. The toolkit implements several types of feature extraction methods and machine learning algorithms, which can be used to analyze and compare different approaches to detecting malicious URLs. It further incorporates options such as feature selection and imbalanced learning, with the flexibility to be extended with more functionality and generalization capabilities. Use cases are demonstrated for several datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2021
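By way of illustration, here are a handful of the "internal" lexical URL features such a toolkit might compute; the feature names and selection below are hypothetical, not URLCam's actual API.

```python
# A few common lexical URL features used in malicious-URL detection.
from urllib.parse import urlparse
import math, collections

def url_features(url):
    p = urlparse(url)
    counts = collections.Counter(url)
    entropy = -sum(c / len(url) * math.log2(c / len(url)) for c in counts.values())
    return {
        "length": len(url),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_dots": url.count("."),
        "has_ip_host": p.hostname is not None and p.hostname.replace(".", "").isdigit(),
        "entropy": entropy,                      # character-level entropy
        "num_params": url.count("&") + url.count("="),
    }

print(url_features("http://192.168.0.1/login.php?id=1&x=2"))
```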
19. Analytic Hierarchy Process for website evaluation.
- Author
- Kabassi, Katerina
- Subjects
- WEBSITES, ANALYTIC hierarchy process, COMPUTER software
- Abstract
The evaluation of a website is a complex procedure and is therefore often omitted from the website life-cycle despite its importance. Taking into account the suitability of the Analytic Hierarchy Process (AHP) for evaluating software and the wide usage of the theory in the evaluation of websites, we present a state-of-the-art review of evaluation experiments conducted using AHP. More specifically, the paper presents evaluation experiments that use AHP, fuzzy AHP, or their combination with other methods for website evaluation. The paper covers websites of different domains, e.g. e-government, e-banking, e-news, e-health and e-education, and summarizes the steps that need to be taken to apply AHP regardless of the websites' domain. The outcome of the literature review is three sets of the most frequently used evaluation criteria for the domains examined (e-government, e-education/e-university, e-health). To demonstrate the usefulness of the proposed sets of criteria in real-world evaluation studies, we conducted one evaluation experiment per domain examined. [ABSTRACT FROM AUTHOR]
- Published
- 2018
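The core AHP step these studies share can be sketched in a few lines: criterion weights are the principal eigenvector of a pairwise comparison matrix, checked with Saaty's consistency ratio. The 3x3 judgment matrix below is an invented example.

```python
# AHP priority vector via the principal eigenvector, plus consistency check.
import numpy as np

A = np.array([[1, 3, 5],
              [1/3, 1, 2],
              [1/5, 1/2, 1]])                  # pairwise criterion judgments

vals, vecs = np.linalg.eig(A)
k = np.argmax(vals.real)
w = np.abs(vecs[:, k].real)
w /= w.sum()                                   # priority vector (criterion weights)

n = A.shape[0]
ci = (vals.real[k] - n) / (n - 1)              # consistency index
cr = ci / 0.58                                 # Saaty's random index for n=3
print("weights", w, "consistency ratio", cr)   # CR < 0.1 means acceptable
```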
20. Privacy, security and policies: A review of problems and solutions with semantic web technologies.
- Author
- Kirrane, Sabrina, Villata, Serena, and d'Aquin, Mathieu
- Subjects
- SEMANTIC Web, WEBSITES, GOVERNMENT policy, WEBSITE security
- Abstract
Semantic Web technologies aim to simplify the distribution, sharing and exploitation of information and knowledge, across multiple distributed actors on the Web. As with all technologies that manipulate information, there are privacy and security implications, and data policies (e.g., licenses and regulations) that may apply to both data and software artifacts. Additionally, semantic web technologies could contribute to the more intelligent and flexible handling of privacy, security and policy issues, through supporting information integration and sense-making. In order to better understand the scope of existing work on this topic we examine 78 articles from dedicated venues, including this special issue, the PrivOn workshop series, two SPOT workshops, as well as the broader literature that connects the Semantic Web research domain with issues relating to privacy, security and/or policies. Specifically, we classify each paper according to three taxonomies (one for each of the aforementioned areas), in order to identify common trends and research gaps. We conclude by summarising the strong focus on relevant topics in Semantic Web research (e.g. information collection, information processing, policies and access control), and by highlighting the need to further explore under-represented topics (e.g., malware detection, fraud detection, and supporting policy validation by data consumers). [ABSTRACT FROM AUTHOR]
- Published
- 2018
21. Modeling an web community discovery method with web page attraction.
- Author
- Lei, Shi
- Subjects
- WEBSITES, ALGORITHMS, GRAVITATION
- Abstract
An improved Web community discovery algorithm based on the attraction between Web pages is proposed in this paper to effectively reduce the complexity of Web community discovery. Drawing on the theory of universal gravitation, the proposed algorithm treats each Web page in the collection as an individual with attraction, elaborates the discovery and evolution process of a Web community starting from a single Web page, defines priority rules for Web community size and Web page similarity, and gives the calculation formula for the change in Web page similarity. Finally, an experimental platform is built to analyze the discovery process of the Web community in detail, and the changes in the cumulative distribution of Web page similarity are discussed. The results show that the change in the similarity of a new page satisfies a power-law distribution, and the similarity of a new page is proportional to the size of the Web community that the new page chooses to join. [ABSTRACT FROM AUTHOR]
- Published
- 2021
22. UCrawler: A learning-based web crawler using a URL knowledge base.
- Author
- Wang, Wei and Yu, Lihua
- Subjects
- KNOWLEDGE base, WEBSITES, SEARCH algorithms, ALGORITHMS, COMPUTATIONAL complexity, WEB search engines, SEARCH engines
- Abstract
Focused crawlers, as fundamental components of vertical search engines, crawl the web pages related to a specific topic. Existing focused crawlers commonly suffer from low crawling efficiency and subject migration. In this paper, we propose a learning-based focused crawler using a URL knowledge base. To improve accuracy, topic similarity is measured from the parent page content, anchor information, and URL content, and the URL content is learned and updated iteratively and continuously. Within the crawler, we implement a crawling mechanism based on a strategy combining content analysis and simple link analysis, which decreases computational complexity and avoids the locality problem of crawling. Experimental results show that our proposed algorithm achieves better precision than traditional methods, including the shark-search and best-first search algorithms, and avoids the local optimum problem of crawling. [ABSTRACT FROM AUTHOR]
- Published
- 2021
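A minimal sketch of the combined link-scoring idea: a priority frontier ordered by a weighted mix of parent-page, anchor-text, and URL-token similarity to the topic. The 0.5/0.3/0.2 weights and the TF-IDF cosine scorer are assumptions, not UCrawler's exact formula.

```python
# Frontier ordered by a combined topic-similarity score for each link.
import heapq
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

TOPIC = "machine learning for web page classification"   # assumed topic seed
vec = TfidfVectorizer().fit([TOPIC])

def sim(text):
    return cosine_similarity(vec.transform([text]), vec.transform([TOPIC]))[0, 0]

def link_score(parent_text, anchor_text, url_tokens):
    return 0.5 * sim(parent_text) + 0.3 * sim(anchor_text) + 0.2 * sim(url_tokens)

frontier = []                                  # max-heap via negated scores
s = link_score("a page about machine learning", "web page classification",
               "example com machine learning pages")
heapq.heappush(frontier, (-s, "http://example.com/ml"))
score, url = heapq.heappop(frontier)           # crawl the best-scoring link next
```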
23. Organic consumption through ICT: A moderated mediation model of consumer attitude and perceived irritation.
- Author
- Tariq, Anum and Tanveer, Yasir
- Subjects
- CONSUMER attitudes, VIRTUAL communities, CONSUMERS' reviews, WEBSITES, ORGANIC products, ONLINE shopping, INFLUENCE
- Abstract
BACKGROUND: The proliferation of ICT has transformed customers' buying patterns. Organic consumption patterns based on online consumers' reviews were examined through a moderated mediation model. This paper examines the mediating role of consumer attitude in the relationship between online consumers' reviews and organic consumption, and further examines whether perceived irritation moderates this effect. METHODS: Data were collected from 287 respondents experienced in the online buying of organic products. The PROCESS macro in IBM SPSS Statistics 23 was applied to assess the hypothesized latent relationships. IMPLICATIONS: These results urge marketers to reconsider their strategies for engaging modern consumers and to design more artistic web pages braced with a social feedback mechanism to stream organic consumption. CONCLUSIONS: The results support that online consumers' reviews positively influence the organic consumption of Chinese consumers; this effect is stronger when the perceived irritation of webpages is low. [ABSTRACT FROM AUTHOR]
- Published
- 2021
24. Mining the information architecture of the WWW using automated website boundary detection.
- Author
- Alshukri, Ayesh and Coenen, Frans
- Subjects
- WEBSITES, INTERNET content, SPAM email, COMPUTER network resources, INTERNET
- Abstract
The world wide web has two main forms of architecture: the first is that which is explicitly encoded into web pages, and the second is that which is implied by the web content, particularly pertaining to look and feel. The latter is exemplified by the concept of a website, a concept that is only loosely defined although users intuitively understand it. The Website Boundary Detection (WBD) problem is concerned with identifying the complete collection of web pages and resources contained within a single website. The concept of a website is used in a number of application domains, including website archiving, spam detection, and WWW analysis; in such applications it is beneficial if a website can be identified automatically, usually in terms of its boundary, hence the WBD problem. In this paper seven WBD techniques are proposed and compared: four statistical techniques, where the web data to be used is obtained a priori, and three dynamic techniques, where the data is obtained as the process progresses. All seven techniques are presented in detail and evaluated. [ABSTRACT FROM AUTHOR]
- Published
- 2017
25. Hierarchical features-based targeted aspect extraction from online reviews.
- Author
- He, Jin, Li, Lei, Wang, Yan, and Wu, Xindong
- Subjects
- INFORMATION modeling, SEMANTICS, WEBSITES
- Abstract
With the prevalence of online review websites, large-scale data promote the need for focused analysis. This task aims to capture information that is highly relevant to a specific aspect. However, the broad scope of aspects across various products makes the task overarching but challenging. A commonly used solution is to modify topic models with additional information to capture the features of a specific aspect (referred to as a targeted aspect). However, existing topic models either perform a full analysis to capture as many features as possible, or estimate similarity to capture features that are as coherent as possible; they overlook the fine-grained semantic relations between features, so the captured features are coarse and confusing. In this paper, we propose a novel Hierarchical Features-based Topic Model (HFTM) to extract targeted aspects from online reviews and then capture the aspect-specific features. Specifically, our model can capture not only direct features posing target-to-feature semantics but also latent features posing feature-to-feature semantics. Experiments conducted on real-world datasets demonstrate that HFTM outperforms the state-of-the-art baselines in terms of both aspect extraction and document classification. [ABSTRACT FROM AUTHOR]
- Published
- 2021
26. Extraction of unexpected sentences: A sentiment classification assessed approach.
- Author
- Dong (Haoyuan) Li, Laurent, Anne, Poncelet, Pascal, and Roche, Mathieu
- Subjects
- DATA mining, SEMANTICS, MACHINE learning, WEBSITES, STATISTICS
- Abstract
Sentiment classification in text documents is an active data mining research topic in opinion retrieval and analysis. Unlike previous studies concentrating on the development of effective classifiers, we focus on the extraction and validation of unexpected sentences arising in sentiment classification, and propose a general framework for determining them. The relevance of the extracted unexpected sentences is assessed in the context of text classification. In the experiments, we present the extraction of unexpected sentences for sentiment classification within the proposed framework and then evaluate the influence of unexpected sentences on the quality of classification tasks. The experimental results show the effectiveness and usefulness of our proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2010
27. Novel JavaScript malware detection based on fuzzy Petri nets.
- Author
- Lin, Yi-Nan, Wang, Sheng-Kuan, Yang, Cheng-Ying, Shen, Victor R.L., Juang, Tony Tong-Ying, and Wei, Chin-Shan
- Subjects
- PETRI nets, JAVASCRIPT programming language, WEBSITES, MALWARE
- Abstract
JavaScript is currently a popular scripting language for building web pages; it allows website creators to run any program code they want when users visit their websites. Meanwhile, malicious JavaScript has become one of the biggest threats in the cyber world, and researchers are searching for a convenient and effective way to detect JavaScript malware. Consequently, this paper proposes a novel method of detecting JavaScript malware using a high-level fuzzy Petri net (HLFPN). First, web pages are crawled to obtain JavaScript files. Second, six main features are extracted from the JavaScript files: longest word size, entropy, specific characters, commenting style, function calls, and abstract syntax tree (AST) features. Finally, an HLFPN model is used to determine whether malicious code is present. The experimental results fully demonstrate the effectiveness of our proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2020
28. Is Estonian local e-government responsive to citizens' needs? The case study of Tartu.
- Author
- Reinsalu, Kristina
- Subjects
- LOCAL government, INTERNET in public administration, CITIZEN participation in public administration, ESTONIAN politics & government, CIVIL service, INTERNET, WEBSITES, GOVERNMENT websites
- Abstract
This paper examines citizens' interaction with local government. The main concern of the paper is how the citizens of Estonia use the Internet for local services. It presents an empirical study supported by direct observation of websites, with an accompanying description of e- and m-services being implemented or planned, and provides data from a survey conducted among citizens of a major city. My findings show that the indicators of access to the Internet and use of the Internet and mobile phones are very high in all age groups. The evaluation citizens give to the city council's website is also high across different categories. The e- and m-services the city government has implemented or is planning to implement are innovative. However, citizens' readiness to use them is low; where there is interest, it is mainly limited to everyday services like m-parking. Forums and other services implemented to encourage citizens' involvement are not attractive. Finally, my paper discusses the impact of these results with reference to the theoretical framework and argues that the key factor for interactive communication is motivation. Should interactive communication become routine, even if only as "consumerism" of entertainment services, it might provide a solution to problems stemming from the perceived disconnection of political and administrative institutions from citizens' everyday concerns. [ABSTRACT FROM AUTHOR]
- Published
- 2006
29. Optimal rough fuzzy clustering for user profile ontology based web page recommendation analysis.
- Author
- Mohanty, Sachi Nandan, Rejina Parvin, J., Vinoth Kumar, K., Ramya, K.C., Sheeba Rani, S., Lakshmanaprabu, S.K., Yuan, Xiaohui, and Elhoseny, Mohamed
- Subjects
- WEBSITES, DATA scrubbing, DATA structures, INTERNET of things, FUZZY neural networks, ONTOLOGIES (Information retrieval)
- Abstract
Personalized information recommendation based on social labeling is a hot issue in the scholarly community, and the web page data here are collected from the Internet of Things (IoT). To achieve personalized web pages, the current investigation proposes a recommendation framework with two methodologies on user access behavior using the Rough-Fuzzy Clustering (RFC) technique. In this paper, a fuzzy-based Web Page Recommendation (WPR) framework is provided with the user profile and ontology design. First, weblog documents were gathered from the IoT, cleaned, and put through the learning process. In the profile ontology module, the learner profile was stored as an ontology with a clear structure and data. An innovative similarity measure was considered for identifying similar data, and for an effective WPR process the rules generated by RFC were optimized with the help of the Chicken Swarm Optimization (CSO) technique. Finally, the output based on these optimal rules recommends e-commerce shopping websites with better performance. A group of randomly selected users was isolated and, on the basis of the obtained data, clustered by cluster analysis. Based on the proposed model, the results were analyzed with performance measures, and more top recommended pages were provided to users compared to existing clustering techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2019
30. Characteristics and categorization of services in CLOUBI: A CLOud-based UBIquitous architecture.
- Author
- Salehan, Alireza, Deldari, Hossein, and Abrishami, Saeid
- Subjects
- UBIQUITOUS computing, CATEGORIES (Mathematics), CLOUD computing, WEBSITES, COMPUTER software, COMPUTER network architectures
- Abstract
Challenges in pure ubiquitous computing, including the limitation of being multi-domain, the absence of a uniform namespace, the impossibility of intensive user mobility, limited resources, lack of scalability, intensive applications, and so forth, have led researchers to propose hybrid ubiquitous architectures that are generally cloud-based. However, the various types of services have not been considered in the introduced hybrid architectures, nor has any of these architectures provided a general categorization of services independent of application type. The current paper introduces a cloud-based ubiquitous architecture called CLOUBI. The main purpose of this architecture is to remove the restrictions of pure ubiquitous architectures. In addition, two general categorizations of services are presented in this architecture, one based on the nature of services and the other on the distance between the requesting and responding entities, both independent of application type. In the nature-based categorization, services are divided into four types: data, context-aware, software, and hardware. In the distance-based categorization, services are of five types: near, local, remote, global, and far. In designing the CLOUBI architecture and formulating its proposed categorization, various criteria are examined, including the types of inputs, the nature of requests, the characteristics of an ideal ubiquitous architecture, and the features expected from services in hybrid architectures. The most important security concerns related to ubiquitous cloud computing architectures are also discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2018
31. Multi-modal deep distance metric learning.
- Author
- Roostaiyan, Seyed Mahdi, Imani, Ehsan, and Baghshah, Mahdieh Soleymani
- Subjects
- WEBSITES, IMAGE retrieval, ALGORITHMS, INFORMATION retrieval, DATA analysis
- Abstract
In many real-world applications, data contain heterogeneous input modalities (e.g., web pages include images, text, etc.). Moreover, data such as images are usually described using different views (i.e., different sets of features). Learning a distance metric or similarity measure that draws on all input modalities or views is essential for tasks such as content-based retrieval. In these cases, similar and dissimilar pairs of data can be used to find a better representation in which similarity and dissimilarity constraints are better satisfied. In this paper, we incorporate supervision in the form of pairwise similarity and/or dissimilarity constraints into multi-modal deep networks to combine different modalities into a shared latent space. Using properties of multi-modal data, we design multi-modal deep networks and propose a pre-training algorithm for them. The proposed network can learn intra- and inter-modal high-order statistics from raw features, and we control its high flexibility via an efficient multi-stage pre-training phase matched to the properties of multi-modal data. Experimental results show that the proposed method outperforms recent methods on image retrieval tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2017
32. A semantic approach to cross-document person profiling in Web.
- Author
- Emami, Hojjat, Shirazi, Hossein, and Barforoush, Ahmad Abdollahzadeh
- Subjects
- WEBSITES, INFORMATION retrieval, ONLINE profiling, ELECTRONIC records, DATABASES
- Abstract
The problem of cross-document person profiling aims at identifying and linking person entities across Web pages and extracting their relevant structured information. In this paper, we focus on the core task of the person profiling problem, namely attribute extraction. For attribute extraction, existing approaches face several challenges, two important ones being (i) syntactic and structural variation, and (ii) cross-sentence and cross-document information extraction. To alleviate these deficiencies and improve on the performance of existing methods, we propose a semantic attribute extraction approach relying on probabilistic reasoning. Our approach produces structured, meaningful profiles in which the resulting textual facts are linked to their possible actual meaning in a distant ontology. We evaluate our approach on standard profile extraction datasets. Experimental results demonstrate that our approach achieves better results when compared with several baselines and state-of-the-art counterparts. The results show that our approach is a promising solution to the person profiling problem. [ABSTRACT FROM AUTHOR]
- Published
- 2017
33. Using online data sources to make query suggestions for children.
- Author
- Soledad Pera, Maria and Yiu-Kai Ng
- Subjects
- WEB search engines, SEARCH engines, WEBSITES, CROWDSOURCING
- Abstract
Existing popular web search engines are widely used for retrieving information of interest and offer query suggestions (QS) to assist users in exploring the wealth of information online. These search tools, however, are designed without any specific group of users in mind and thus are not tailored towards the specific needs of children, which can diminish their usability and design objectives when they are employed by children. Given the increasing use of the Web for educational and entertainment purposes by children, there is an urgent need to help them search the Web effectively. In this paper, we present a QS module, denoted CQS, which assists children in finding appropriate query keywords to capture their information needs by (i) analyzing content written for/by children, (ii) examining phrases and other metadata extracted from reputable (children's) websites, and (iii) using a supervised learning approach to rank suggestions that are appealing to children. CQS offers suggestions with vocabulary that can be comprehended by children and with topics of interest to them. We conducted a number of empirical studies using keyword queries initiated by children, besides gathering feedback on the usefulness of CQS-generated suggestions through crowdsourcing. The performance evaluation revealed the effectiveness of the CQS methodology. In addition, it demonstrated that CQS-generated suggestions were preferred over suggestions provided by Bing and Yahoo! and were at least comparable to queries suggested by Google. [ABSTRACT FROM AUTHOR]
- Published
- 2017
34. An Approach for Evaluating Website Quality in Hotel Industry Based on Triangular Intuitionistic Fuzzy Numbers.
- Author
- STANUJKIC, Dragisa, ZAVADSKAS, Edmundas Kazimieras, KARABASEVIC, Darjan, UROSEVIC, Snezana, and MAKSIMOVIC, Mladjan
- Subjects
- WEBSITES, HOTELS, FUZZY numbers, DECISION making, PROBLEM solving
- Abstract
Compared to fuzzy numbers, intuitionistic fuzzy numbers provide greater opportunities for solving complex decision-making problems, especially those involving ambiguity, uncertainty and vagueness. However, their use is more complex, especially for ordinary users. Therefore, this paper proposes an approach for evaluating alternatives on the basis of a smaller number of more complex evaluation criteria. The approach is based on the use of linguistic variables, triangular intuitionistic fuzzy numbers, and the Hamming distance. A case study evaluating hotels' websites is given to demonstrate the practicality and effectiveness of the proposed approach, together with its limitations and weaknesses. Additionally, a new procedure for ranking intuitionistic fuzzy numbers is proposed and its use is verified. [ABSTRACT FROM AUTHOR]
- Published
- 2017
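As a hedged sketch of the distance computation, below is one common normalized Hamming distance between triangular intuitionistic fuzzy numbers, each given as (l, m, u) plus membership and non-membership degrees; the paper may use a variant definition.

```python
# One common normalized Hamming distance for TIFNs (an assumed variant).
def tifn_hamming(a, b):
    (l1, m1, u1, mu1, nu1), (l2, m2, u2, mu2, nu2) = a, b
    tri = abs(l1 - l2) + abs(m1 - m2) + abs(u1 - u2)   # triangular part
    return (tri + abs(mu1 - mu2) + abs(nu1 - nu2)) / 5

site_a = (0.5, 0.6, 0.7, 0.8, 0.1)   # e.g. one website's aggregated rating
ideal = (0.9, 1.0, 1.0, 1.0, 0.0)    # positive-ideal alternative
print(tifn_hamming(site_a, ideal))   # smaller distance means a better website
```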
35. Evaluation of the Quality and Readability of Online Information about Alzheimer's Disease in China.
- Author
- Chu, Yili, Xie, Qihui, Meng, Rongrong, Leng, Bing, and Cao, Zhenxiang
- Subjects
- ALZHEIMER'S disease, WEBSITES
- Abstract
Background: With the increasing popularity of the internet, a growing number of patients and their companions are actively seeking health-related information online. Objective: The aim of this study was to assess the quality and readability of online information about Alzheimer's disease (AD) in China. Methods: A total of 263 qualified AD-related web pages from different businesses, governments, and hospitals were obtained. The quality of the web pages was assessed using the DISCERN tool, and the readability of the web pages was assessed using a readability measurement website suitable for the Chinese language. The differences in readability and quality between different types of web pages were investigated, and the correlation between quality and readability was analyzed. Results: The mean overall DISCERN score was 40.93±7.5. The government group scored significantly higher than the commercial and hospital groups. The mean readability score was 12.74±1.27, and the commercial group had the lowest readability score. There was a positive correlation between DISCERN scores and readability scores. Conclusions: This study presents an evaluation of the quality and readability of health information pertaining to AD in China. The findings indicate that there is a need to enhance the quality and readability of web pages about AD in China. Recommendations for improvement are proposed in light of these findings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
36. A Stackelberg solution for fuzzy random competitive location problems with demand site uncertainty.
- Author
- Uno, Takeshi, Katagiri, Hideki, and Kato, Kosuke
- Subjects
- WEBSITES, FUZZY measure theory, RANDOM variables, RANDOM numbers, MATHEMATICAL variables
- Abstract
This paper focuses on a Stackelberg location problem on a tree network with demands whose sites are given uncertainly and vaguely. By representing the sites as fuzzy random variables on the network, the distances from demands to facilities can be defined as fuzzy random numbers, and the location problem can then be formulated as a fuzzy random bilevel programming problem. To solve the problem, we first introduce the α-level set for fuzzy random numbers and transform it into a random bilevel programming problem. Next, we consider the situation in which the leader specifies a guaranteed probability for her/his objective function value. By adding the corresponding constraint for both decision makers, the problem can be reformulated as a bilevel programming problem, a version of the conventional Stackelberg location problem. Finally, its complexity is shown, and a solution method for the case where the leader locates one facility is given, based upon the characteristics of facility location. [ABSTRACT FROM AUTHOR]
- Published
- 2012
37. Being transparent or spinning the message? An experiment into the effects of varying message content on trust in government.
- Author
- Meijer, Albert, Bannister, Frank, and Grimmelikhuijsen, Stephan
- Subjects
- TRANSPARENCY in government, GOVERNMENT policy, INFORMATION & communication technologies, GOVERNMENT information, GOVERNMENT websites, RELIABILITY (Personality trait), FREEDOM of information
- Abstract
Computer-mediated transparency is widely acknowledged to be a powerful instrument to strengthen citizen trust in government. However, government websites are often used as a convenient way to spread 'spinned' policy messages with highly positive interpretations of government policies. This paper focuses on this particular element of transparency: the extent to which a policy message contains balanced information. A truly balanced message should also mention dissenting viewpoints on government policies. This study examines the effect on trust of a balanced message compared to messages subject to varying degrees of spin. An experiment was designed to compare the effect of a very positive policy message, a slightly positive message, and a message containing both positive and negative information. The results demonstrate that a balanced message on a website about government policy leads to negative evaluations of government competence to solve policy problems. Further, less spin does not positively affect the perceived honesty and benevolence of a government organization. This study suggests that showing balanced content might not help increase trust in government, and that people might even prefer a light form of spin on government information, as it conveys the image that government knows what it is doing and where it is heading. [ABSTRACT FROM AUTHOR]
- Published
- 2011
38. Insightful slideshow: Automatic composition of personal photograph slideshow using the web.
- Author
- Yatsugi, Kotaro, Fujimura, Naomi, and Ushiama, Taketoshi
- Subjects
- DIGITAL control systems, COMPUTERS, AUTOMATIC control systems, HIGH technology, WEBSITES, COMPUTER network resources
- Abstract
Recently, the number of digital content objects has been increasing rapidly with the progress of information technology, and how to manage these objects effectively and utilize them efficiently has become important. A great deal of research on digital content management has been reported. One important objective of conventional management techniques is to find digital content objects that satisfy a user's information request; this is based on the assumption that the user has one or more information requests. However, a user may have no explicit information request when using certain devices for presenting digital content, such as a digital photoframe. Such devices are expected to provide a presentation of digital content that rouses the user's interest. In this paper, we introduce an approach for automatically composing a photo slideshow that attracts the user's interest. [ABSTRACT FROM AUTHOR]
- Published
- 2010
39. Estimating the size and evolution of categorised topics in web directories.
- Author
- Anagnostopoulos, Ioannis and Anagnostopoulos, Christos-Nikolaos
- Subjects
- WEBSITES, ELECTRONIC directories, EXPERIMENTS, ARTIFICIAL intelligence, SIMULATION methods & models
- Abstract
In this paper a statistical approach for estimating the evolution of categorized web page populations in web directories is proposed. The proposal is based on the capture-recapture method used in wildlife biology, modified with the assumptions and amendments necessary for conducting the experiments on the web. In these experiments, web pages are likened to animals, and specific categories of web pages are likened to particular species whose abundance, birth and survival rates are estimated. The capture-recapture model followed allows us to consider the populations under study as open: over time the population evolves, meaning that new web pages enter the study while others are removed or become inactive, resembling the natural processes of migration or death. Artificial intelligence classifiers capable of categorizing web pages play the role of the biologists who recognize the species under study. In our work, four different simulations, based on four real classification cases, were conducted to evaluate the robustness of the model on the web paradigm. The paper provides the implementation details of our proposed web-based capture-recapture model, along with its initial assessment. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
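As a concrete illustration of the capture-recapture idea underlying entry 39, the sketch below applies the classic closed-population Lincoln-Petersen estimator to two crawls of a web directory category. The paper itself uses an open-population model that also estimates birth and survival rates; the URLs, sample sizes, and helper names here are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: closed-population Lincoln-Petersen estimate of a web
# directory category's size from two crawls. The paper's open-population
# model is more elaborate; all names and numbers here are illustrative.

def lincoln_petersen(first_crawl: set[str], second_crawl: set[str]) -> float:
    """Estimate category size from two independently sampled sets of URLs
    that a classifier assigned to the category of interest."""
    n1 = len(first_crawl)                 # pages "marked" in the first crawl
    n2 = len(second_crawl)                # pages caught in the second crawl
    m2 = len(first_crawl & second_crawl)  # "recaptured" pages seen in both
    if m2 == 0:
        raise ValueError("no recaptures: samples too small to estimate")
    return n1 * n2 / m2

crawl1 = {f"http://example.org/page{i}" for i in range(400)}
crawl2 = {f"http://example.org/page{i}" for i in range(330, 680)}
print(lincoln_petersen(crawl1, crawl2))  # 400 * 350 / 70 = 2000.0 pages
```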
40. A cascade architecture for DoS attacks detection based on the wavelet transform.
- Author
-
Dainotti, Alberto, Pescapé, Antonio, and Ventre, Giorgio
- Subjects
LOCAL area networks ,WEBSITES ,ANOMALY detection (Computer security) ,ECONOMICS ,COMPUTER industry ,INTERNET - Abstract
In this paper we propose an automated system able to detect volume-based anomalies in network traffic caused by Denial of Service (DoS) attacks. We designed a system with a two-stage architecture that combines traditional change-point detection approaches (Adaptive Threshold and Cumulative Sum) with a novel one based on the Continuous Wavelet Transform. The presented anomaly detection system achieves good results in terms of the trade-off between correct detections and false alarms, estimation of anomaly duration, and ability to distinguish between subsequent anomalies. We test our system using a set of publicly available attack-free traffic traces onto which we superimpose anomaly profiles, obtained both as time series of known common behaviors and by generating traffic with real DoS attack tools. Extensive test results show how the proposed system accurately detects a wide range of DoS anomalies and how the performance indicators are affected by anomaly characteristics (i.e., amplitude and duration). Moreover, we separately consider and evaluate some special test cases. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
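The first stage of the architecture in entry 40 combines Adaptive Threshold and Cumulative Sum (CUSUM) change-point detection. The minimal CUSUM sketch below, applied to a per-interval traffic-volume series, illustrates that stage; the baseline, drift, and threshold values are illustrative assumptions rather than the paper's tuned parameters, and the wavelet-based second stage is omitted.

```python
# Hedged sketch: a CUSUM change-point detector on a traffic-volume series
# (e.g., packets per second per interval). Parameter values are assumed.

def cusum_alarms(volumes, baseline, drift=2.0, threshold=10.0):
    """Return the interval indices at which the accumulated positive
    deviation from the baseline exceeds the alarm threshold."""
    alarms, s = [], 0.0
    for i, v in enumerate(volumes):
        # accumulate only deviations larger than the allowed drift
        s = max(0.0, s + (v - baseline - drift))
        if s > threshold:
            alarms.append(i)
            s = 0.0  # reset after raising an alarm
    return alarms

# Steady traffic around 100 units with a surge resembling a DoS anomaly.
trace = [100, 101, 99, 100, 130, 135, 140, 100, 99]
print(cusum_alarms(trace, baseline=100.0))  # -> [4, 5, 6]
```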
41. Exploring the core: An examination of required courses in ALA-accredited Library and Information Science programs.
- Author
-
Hall, Russell A.
- Subjects
CURRICULUM ,EDUCATIONAL accreditation ,EDUCATIONAL programs ,INFORMATION science ,LIBRARY science ,WEBSITES ,RESEARCH ,INFORMATION technology ,STUDENTS - Abstract
This paper examines the required courses of ALA-accredited Library and Information Science programs as published on their websites. The study expands on previous research in this area. Findings show that the typical core curriculum has grown to include both research and information technology in addition to the more traditional subjects. The number of programs that require a secondary set of courses (a semi-core curriculum) in addition to the primary core curriculum has also increased. The paper concludes by suggesting that LIS programs keep a robust general core curriculum while considering the "career-track" model to help students best meet their professional objectives. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
42. Detecting data records in semi-structured web sites based on text token clustering.
- Author
-
Xiaoying Gao, Le Phong Bao Vuong, and Mengjie Zhang
- Subjects
WORLD Wide Web ,INTERNET ,XML (Extensible Markup Language) ,WEBSITES ,ALGORITHMS - Abstract
This paper describes a new approach to the use of clustering for automatic data detection in semi-structured web pages. Unlike most existing web information extraction approaches, which usually apply wrapper induction techniques to manually labelled web pages, this approach avoids the pattern induction process by using clustering techniques on unlabelled pages. In this approach, a variant of the Hierarchical Agglomerative Clustering (HAC) algorithm, called K-neighbours-HAC, is developed which uses the similarities of the data format (HTML tags) and the data content (text string values) to group similar text tokens into clusters. We also develop a new method of labelling text tokens to capture the hierarchical structure of HTML pages, and an algorithm for mapping labelled text tokens to XML. The new approach is tested and compared with several common existing wrapper induction systems on three different sets of web pages. The results suggest that the new approach is effective for data record detection and that it outperforms the existing approaches examined on these web sites. Compared with those approaches, the new approach requires no training and avoids the explicit pattern induction process, making the entire data detection process simpler. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
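The abstract of entry 42 does not spell out the neighbour restriction that distinguishes K-neighbours-HAC from plain HAC, so the sketch below only illustrates the underlying idea: agglomeratively merging text tokens using a combined tag-format and text-content similarity. The weights, threshold, and token layout are assumptions.

```python
# Hedged sketch: agglomerative clustering of text tokens by a combined
# HTML-tag-path and string-content similarity, in the spirit of entry 42.
# Similarity weights and the merge threshold are illustrative assumptions.
from difflib import SequenceMatcher

def similarity(a, b, w_tag=0.5):
    """Blend tag-path similarity with text-content similarity."""
    tag_sim = 1.0 if a["tags"] == b["tags"] else 0.0
    text_sim = SequenceMatcher(None, a["text"], b["text"]).ratio()
    return w_tag * tag_sim + (1 - w_tag) * text_sim

def agglomerate(tokens, min_sim=0.6):
    """Greedy single-linkage merging until no cluster pair is similar enough."""
    clusters = [[t] for t in tokens]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                best = max(similarity(a, b)
                           for a in clusters[i] for b in clusters[j])
                if best >= min_sim:
                    clusters[i] += clusters.pop(j)  # merge j into i
                    merged = True
                    break
            if merged:
                break
    return clusters

tokens = [
    {"tags": "html/body/table/tr/td", "text": "Nikon D80 $499"},
    {"tags": "html/body/table/tr/td", "text": "Canon EOS $549"},
    {"tags": "html/body/h1",          "text": "Camera store"},
]
print(agglomerate(tokens))  # the two <td> price records cluster together
```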
43. Visualising official statistics.
- Author
-
ten Bosch, Olav and de Jonge, Edwin
- Subjects
STATISTICS & society ,VIDEO games ,VISUAL programming languages (Computer science) ,COMPUTER-generated imagery ,WEBSITES ,EXPERIMENTAL design - Abstract
Data are everywhere: on the internet, in newspapers, in computer games, at school, at work and on TV. Data are essential to our daily information needs, and people are increasingly becoming used to reading and interpreting data. Companies use advanced technologies to present their data in fancy and visually attractive ways. Animated and interactive graphs, mapping tools, and charting components are now common techniques, and many examples of attractive and well-designed data visualisation are to be found on the internet in particular. Many of these concepts can be used to visualise official statistics. Indeed, we think people expect official statistics to follow these trends so that they can be consulted with the same tools and concepts. This paper presents an overview of visualisation trends on the internet. It highlights visualisation initiatives that can currently be found on the web and are likely to be useful for official statistics. It also describes the status of visualisation activities at Statistics Netherlands, in the field of regional statistics as well as other more specific data visualisations, and touches on some of the more experimental features currently being developed. [ABSTRACT FROM AUTHOR]
- Published
- 2008
44. Identifying a hierarchy of bipartite subgraphs for web site abstraction.
- Author
-
Cheung, William K. and Yuxiang Sun
- Subjects
BIPARTITE graphs ,WEBSITES ,DATA mining ,PROBLEM solving ,DATABASE searching - Abstract
The Web is transforming from a mere information dissemination platform into a distributed knowledge-based platform for supporting complex problem solving. However, the existing Web contains a large amount of knowledge that is tagged only with layout-related markup, making it hard to discover and use. In this paper, we propose to model semantic-rich, self-contained knowledge units embedded in a web site as a mixture of bipartite subgraphs, and to extract those subgraphs as the web site's abstraction via hyperlink structure and file hierarchy analysis. A recursive algorithm, named ReHITS, is derived which can identify bipartite subgraphs with a hierarchical organization. Each identified subgraph contains a set of associated authorities and hubs as its summarized semantic description. The effectiveness of the algorithm has been evaluated using three real web sites (containing ∼10,000 web pages) with promising results. Detailed interpretation of the experimental results and a qualitative comparison with other related work are also included. [ABSTRACT FROM AUTHOR]
- Published
- 2007
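ReHITS builds on the classic HITS hub/authority computation; the sketch below shows that underlying iteration on a toy site graph. The recursive decomposition into a hierarchy of bipartite subgraphs is the paper's contribution and is not reproduced here; the graph, iteration count, and normalisation choice are illustrative.

```python
# Hedged sketch: the basic HITS hub/authority iteration that ReHITS applies
# recursively. The site graph and parameters are illustrative assumptions.
from math import sqrt

def hits(out_links: dict[str, list[str]], iterations: int = 50):
    """out_links maps each page to the pages it links to."""
    pages = set(out_links) | {q for qs in out_links.values() for q in qs}
    hub = dict.fromkeys(pages, 1.0)
    auth = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        # authority score: sum of hub scores of the pages pointing at it
        auth = {p: sum(hub[q] for q in out_links if p in out_links[q])
                for p in pages}
        # hub score: sum of authority scores of the pages it points at
        hub = {p: sum(auth[q] for q in out_links.get(p, [])) for p in pages}
        for scores in (auth, hub):  # L2-normalise so values stay bounded
            norm = sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

site = {"index": ["a", "b"], "sitemap": ["a", "b", "c"]}
hub, auth = hits(site)
print(sorted(auth, key=auth.get, reverse=True)[:2])  # 'a' and 'b' rank highest
```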
45. Ontology construction and concept reuse with formal concept analysis for improved web document retrieval.
- Author
-
Cho, W. C. and Richards, D.
- Subjects
WORLD Wide Web ,SEARCH engines ,WEBSITES ,KEYWORD searching ,INTERNET domain names ,ELECTRONIC information resource searching - Abstract
The World Wide Web (WWW) has become the most popular place to collect information. However, the exponential growth in the size of the WWW makes it difficult for people to find what they are looking for. Even though about 85% of users rely on search engines to locate information, the engines often do not return information found to be relevant to the user. The main focus of this paper is to improve search performance by reusing keywords and Web pages that have been previously used or visited by other users. The Formal Concept Analysis (FCA) method has been adapted to maintain a concept map for the reuse of knowledge. This paper shows that both precision and recall improved when the technique was employed by users sharing the same knowledge in a specific domain area. [ABSTRACT FROM AUTHOR]
- Published
- 2007
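For readers unfamiliar with Formal Concept Analysis, the sketch below derives formal concepts (closed page-set/keyword-set pairs) from a toy context mapping web pages to the keywords that describe them. The context and the brute-force enumeration are illustrative only; practical FCA tools use faster algorithms such as NextClosure.

```python
# Hedged sketch: enumerating formal concepts from a page/keyword context,
# the FCA building block behind entry 45. Context contents are assumed.
from itertools import combinations

context = {  # page -> keywords previously used to find or describe it
    "page1": {"ontology", "semantic", "web"},
    "page2": {"ontology", "retrieval"},
    "page3": {"semantic", "web", "retrieval"},
}

def common_keywords(pages):
    """Intent: keywords shared by every page in the given set."""
    sets = [context[p] for p in pages]
    return set.intersection(*sets) if sets else set()

def pages_with(keywords):
    """Extent: all pages whose keyword set contains the given keywords."""
    return {p for p, kws in context.items() if keywords <= kws}

concepts = set()
for r in range(len(context) + 1):
    for extent in combinations(sorted(context), r):
        intent = common_keywords(extent)
        closed_extent = pages_with(intent)  # closure of the candidate extent
        concepts.add((tuple(sorted(closed_extent)), tuple(sorted(intent))))

for extent, intent in sorted(concepts):
    print(extent, "share", intent)
```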
46. Evaluating public administration e-portals.
- Author
-
Leben, Anamarija, Kunstelj, Mateja, Bohanec, Marko, and Vintar, Mirko
- Subjects
GOVERNMENT websites ,WEB portals ,WEBSITES ,AUTOMATION of public administration ,INTERNET in public administration ,CITIZEN participation in public administration ,MUNICIPAL services ,PUBLIC administration ,ELECTRONIC government information - Abstract
Electronic administration portals have become the basic platform for delivering electronic administrative services to users (citizens and businesses). The quality and user-friendliness of these services depend to a large extent on how well planned and developed the portals are. The first part of the paper presents some of the more notable approaches to measuring the development or sophistication of e-services and administrative e-portals. Part two presents a methodology for evaluating e-portals developed at the Faculty of Administration; its strength lies primarily in its attempt to provide a more comprehensive evaluation of life-event portals. It measures the level of sophistication, coverage, coordination and accessibility of a service and combines these into an overall portal score. The paper closes by presenting empirical results from the application of the methodology to assess 12 administrative portals from around the world. [ABSTRACT FROM AUTHOR]
- Published
- 2006
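The abstract of entry 46 names four indicator groups that are combined into an overall portal score but does not give the aggregation function. The sketch below assumes a simple weighted average over normalised indicator scores purely for illustration; the paper's actual model may aggregate quite differently.

```python
# Hedged sketch: combining the four indicator groups named in entry 46 into
# an overall portal score. Equal weights and 0-1 indicator scales are
# assumptions, not the methodology's actual aggregation rule.

WEIGHTS = {"sophistication": 0.25, "coverage": 0.25,
           "coordination": 0.25, "accessibility": 0.25}

def portal_score(indicators: dict[str, float]) -> float:
    """Weighted average of indicator scores, each normalised to [0, 1]."""
    return sum(WEIGHTS[k] * indicators[k] for k in WEIGHTS)

# Example: a life-event portal with strong coverage but weak coordination.
print(portal_score({"sophistication": 0.8, "coverage": 0.9,
                    "coordination": 0.4, "accessibility": 0.7}))  # -> 0.70
```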
47. Explaining Internet service quality in social security agencies using institutional theory.
- Author
-
Toots, Anu
- Subjects
SOCIAL security ,INTERNET ,PUBLIC administration ,WEBSITES ,WEBSITE management ,MUNICIPAL services ,INSTITUTIONAL theory (Sociology) ,GOVERNMENT insurance - Abstract
This paper applies institutional theory to the analysis of Internet-based social security services. I argue that the institutional features of service providers matter significantly to the quality of websites. The diverse quality of websites, in turn, is a crucial factor in explaining the different take-up rates of e-services. The article tests the relevance of empirical and normative accounts of institutional theory for explaining the quality of public e-services. The websites of five institutions providing social insurance in Estonia serve as the empirical base for the research. Results indicate that the extent of power (de)concentration is a more crucial variable for having citizen-oriented websites than policy content or public-private ownership. [ABSTRACT FROM AUTHOR]
- Published
- 2006
48. Does transparency strengthen legitimacy?
- Author
-
Curtin, Deirdre and Meijer, Albert Jacob
- Subjects
INTERNET in public administration ,LEGITIMACY of governments ,WEBSITES ,TRANSPARENCY (Optics) - Abstract
Does enhanced transparency, through the Internet, boost the legitimacy of the EU? In this paper we present a critical perspective on the assumptions underlying the relation between transparency and legitimacy. We reconstruct three assumptions from EU policy documents – transparency strengthens input legitimacy, output legitimacy and social legitimacy – and then highlight several weaknesses. We conclude that transparency is a key element of democratic institutions but naïve assumptions about the relation between transparency and legitimacy can and should be avoided. We warn against a simplified trust in the benefits of the Internet: enhancing legitimacy is much more complicated than creating fancy websites. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
49. Modeling of modularity and scaling for integration of customer in design of engineer-to-order products.
- Author
-
Siddique, Zahed and Ninan, Jiju A.
- Subjects
CUSTOMIZATION ,COMMERCIAL products ,WEBSITES ,COMPUTER-aided design ,FINITE element method ,COMPUTER software - Abstract
To survive in today's volatile market, companies are striving to deliver greater quality, more customization and innovative designs by offering user-customizable products. Mass customization features products that can be altered or changed by the customer to fit his/her needs. In order to incorporate customer specifications, it is necessary to integrate the customer into the design process, and design tools and methodologies need to be altered to accommodate the customer in designing customized products. This paper presents a web-based framework for providing customizable products in real time by integrating the customer into the design process. The approach is based on templates that automatically generate Computer Aided Design (CAD) models from customer specifications. The structural feasibility of the user-specified products is then evaluated using automated Finite Element Analysis (FEA), optimized, and communicated back to the user in real time. The CAD and FEA product family template generalizes the rules and guidelines for the entire product family. The template, the optimization formulations, and the framework are implemented using commercial software. The applicability of the system is illustrated through web-based customization of bicycle frames. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
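The framework in entry 49 follows a generate-and-evaluate loop: instantiate the CAD template from customer specifications, run automated FEA, and iterate until the design is feasible. The sketch below mimics that loop with a closed-form axial-stress check standing in for the FEA call; the material limit, load factor, and bicycle-frame numbers are illustrative assumptions, not the authors' values, and the paper drives commercial CAD/FEA software instead.

```python
# Hedged sketch: template instantiation plus a feasibility loop, with a
# simple stress formula standing in for the automated FEA run. All numbers
# and names are illustrative assumptions.
import math

YIELD_STRENGTH = 250e6  # Pa, assumed allowable stress for the frame material

def frame_template(rider_weight_kg: float, tube_diameter_m: float) -> dict:
    """Instantiate the product-family template from customer specifications."""
    return {"tube_diameter": tube_diameter_m,
            "load": rider_weight_kg * 9.81 * 3.0}  # assumed 3x dynamic factor

def max_stress(design: dict) -> float:
    """Stand-in for the FEA call: axial stress over the tube cross-section."""
    area = math.pi * (design["tube_diameter"] / 2) ** 2
    return design["load"] / area

def customize(rider_weight_kg: float, diameter_m: float = 0.002) -> dict:
    """Regenerate the template, thickening the tube until the design passes."""
    design = frame_template(rider_weight_kg, diameter_m)
    while max_stress(design) > YIELD_STRENGTH:
        diameter_m *= 1.1  # infeasible: widen the tube and regenerate
        design = frame_template(rider_weight_kg, diameter_m)
    return design

print(customize(rider_weight_kg=90))  # feasible design returned to the user
```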
50. Personalisation in news delivery systems: Item summarization and multi-tier item selection using relevance feedback.
- Author
-
Díaz, Alberto and Gervás, Pablo
- Subjects
INFORMATION retrieval ,WEBSITES ,INFORMATION resources management ,ELECTRONIC information resources ,INFORMATION science - Abstract
The designer of an information dissemination system based on user preferences stated as user models currently faces three basic design decisions: whether to use categories, keywords, or both to enable the user to specify his preferences; whether to use a static long-term model or a dynamic short-term model to register those preferences; and what method to use to provide summaries of the available documents without losing information that may be significant to a particular user even if it would not be considered significant in general terms. Current systems tend to provide one specific choice, either taken at design time by the developer or offered as mutually exclusive alternatives to the user. However, most of the options have relative merits, and an efficient way of combining the various solutions would allow users to select in each case the combination of alternatives best suited to their needs. In this paper we defend the use of a combined approach that integrates: an enriched user model that the user can customise to capture his long-term interests either in terms of categories (newspaper sections) or keywords; a personalised summarization facility to maximise the density of relevance of sent selections; and a tailored relevance feedback mechanism that captures short-term interests as featured in a user's acceptance or rejection of the news items received. Controlled experiments were carried out with a group of users and satisfactory results were obtained, providing material for further development. The experimental results suggest that categories and keywords can be fruitfully combined to express user interests, and that personalised summaries perform better than generic summaries, at least in terms of identifying documents that satisfy user preferences. [ABSTRACT FROM AUTHOR]
- Published
- 2005
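The tailored relevance feedback mechanism described in entry 50 updates the short-term model from the user's acceptance or rejection of news items. The abstract does not give the update rule, so the sketch below assumes a standard Rocchio-style adjustment of a term-weight vector; the weights and toy profiles are illustrative.

```python
# Hedged sketch: Rocchio-style relevance feedback for the short-term user
# model, nudged toward accepted news items and away from rejected ones.
# The paper's exact formula and term weighting are not given in the abstract.
from collections import Counter

def rocchio_update(model: Counter, accepted: list[Counter],
                   rejected: list[Counter],
                   alpha=1.0, beta=0.75, gamma=0.25) -> Counter:
    """Return an updated term-weight vector for the short-term model."""
    updated = Counter({t: alpha * w for t, w in model.items()})
    for doc in accepted:          # pull the model toward accepted items
        for t, w in doc.items():
            updated[t] += beta * w / len(accepted)
    for doc in rejected:          # push the model away from rejected items
        for t, w in doc.items():
            updated[t] -= gamma * w / len(rejected)
    return Counter({t: w for t, w in updated.items() if w > 0})

model = Counter({"economy": 1.0, "sports": 0.5})
accepted = [Counter({"economy": 1.0, "markets": 1.0})]
rejected = [Counter({"sports": 1.0})]
print(rocchio_update(model, accepted, rejected))
# "sports" is damped; "markets" enters the short-term interest profile.
```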