Todd Suomela's Library
"Go to any social gathering in your neighborhood and you will notice that people interact mostly with others who are similar in terms of age, gender, race, attributes, and behaviors. This tendency of people to have similar friends—known as homophily—is one of the most pervasive features of social networks (1). A key question is how much of the homophily in behavior can be attributed to social diffusion, that is, direct causal influence of one person on another through social ties (2, 3). Results from two clever Internet experiments reported by Centola last year (4) and on page 1269 of this issue (5) shed light on how the particular arrangement of social ties promotes social diffusion."
"Decisions about when and how to regulate activities online will have a profound societal impact. Debates underlying such decisions touch upon fundamental problems related to economics, free expression, and privacy. Their outcomes will influence the structure of the Internet, how data can flow across it, and who will pay to build and maintain it. Most striking about these debates are the paucity of data available to guide policy and the extent to which policy-makers ignore the good data we do have."
The best approach is neither to make ill-informed decisions based on too little data nor to avoid state regulation simply because of the absence of decent data. Instead, we should begin a concerted push for highly reliable and publicly available forms of measurement of the Internet and the Web and how we use them, including the flows of information we generate and consume. Better data would do more than just help the state meet its regulatory obligations; better data would also improve self-regulation by private sector players and empower individuals to make better decisions. In the meantime, we as researchers need to work harder to translate the good data that we do have into terms that can directly inform policy-making.
"The hard lesson for governments is that citizens will adopt technology when it is both optional and beneficial to them, but resist it strenuously when it is compulsory, no matter how sensible it may seem. To take another example, if users of public transport in London were told that in future all their trips would be logged by the authorities, they would revolt. But offered lower fares if they use an Oyster card, issued by a branch of government called Transport for London, they have few objections. Nor do they seem to mind much that the same body photographs their car every time they visit central London on a working day to enforce the capital's congestion charge.
Oddly, people seem to mind even less about how much information the private sector holds about them. Supermarket loyalty cards record all their purchases, however revealing, and search engines note everything they have been looking for on the internet. People who would strongly resist giving any personal information to the government are quite happy for Google to know that they have been searching for “hot Asian babes”. The result, says Microsoft's Mr Cameron, is pernicious. “Hundreds of millions of people have been trained to accept anything any site wants to throw at them as being the 'normal way' to conduct business online.”"
Cybercrime discredits the use of the internet not only by business but by government too. Mr Cameron suggests rethinking the whole issue, starting from the principle that users may be identified only with their explicit consent. That sounds commonsensical, but many big government databases do things differently. Britain's planned central records for the NHS, for example, will assume consent as it combines all the medical records held in local practice databases.
The second principle, says Mr Cameron, should be to keep down the risk of a breach by using as little information as possible to achieve the task in hand. This approach, which he calls “information minimalism”, rules out keeping information “just in case”. For example, if a government agency needs to check if someone falls into a certain age group, it is far better to acquire and store this information temporarily as a “yes” or “no” than to record the actual date of birth permanently, which would be much more personal and therefore more damaging if leaked.
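The age-group example in the excerpt above can be illustrated with a short sketch. This is only one possible reading of "information minimalism", not a description of any real system: the names `age_on`, `AgeCheckRecord`, and `check_age_group` are hypothetical, and the sketch assumes the date of birth is available briefly at check time and then discarded.

```python
from dataclasses import dataclass
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Return the number of completed years between birth_date and today."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


@dataclass(frozen=True)
class AgeCheckRecord:
    """Hypothetical stored record: the yes/no answer, never the birth date."""
    person_id: str
    in_age_group: bool


def check_age_group(person_id: str, birth_date: date,
                    min_age: int, max_age: int, today: date) -> AgeCheckRecord:
    """Use the birth date only transiently and keep just the boolean result."""
    age = age_on(birth_date, today)
    return AgeCheckRecord(person_id, min_age <= age <= max_age)


# Example: the caller persists `record`; the birth date is not retained.
record = check_age_group("p-001", date(1990, 6, 15), 18, 65, date(2010, 6, 14))
```

If such a record leaked, it would reveal only that the person fell inside the 18 to 65 band on the day of the check, which is the trade-off the excerpt describes.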
An increase in the number of citizen science programs has prompted an examination of their ability to provide data of sufficient quality. We tested the ability of volunteers relative to professionals in identifying invasive plant species, mapping their distributions, and estimating their abundance within plots. We generally found that volunteers perform almost as well as professionals in some areas, but that we should be cautious about data quality in both groups. We analyzed predictors of volunteer success (age, education, experience, science literacy, attitudes) in training-related skills, but these proved to be poor predictors of performance and could not be used as effective eligibility criteria. However, volunteer success with species identification increased with their self-identified comfort level. Based on our case study results, we offer lessons learned and their application to other programs and provide recommendations for future research in this area.
"Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing."
"We are now interconnected enough for amateur scientists and hobbyists to help professional scientists with sensors scattered over the earth collecting data."
You have the picture of a graph but not the corresponding data? You want to retrieve the trajectory of an object from a QuickTime movie? GraphClick is then simply the best way to solve the problem! You just have to click on the image and the obtained coordinates of the points can be directly exported into any other application.
A hierarchy of needs for data-curation: acquisition, physical medium, bitrot, format viability, usability, fidelity to original.
Since 1993, disseminating scientific research information useful in preventing, mitigating, or adapting to the effects of global change.
There is growing interest in the issues of preservation and re-use of the records of science, in the "digital era". The aim of the PARSE.Insight project, partly financed by the European Commission under the Seventh Framework Program, is twofold: to provide an assessment of the current activities, trends and risks in the field of digital preservation of scientific results, from primary data to published articles; to inform the design of the preservation layer of an emerging e-Infrastructure for e-Science. CERN, as a partner of the PARSE.Insight consortium, is performing an in-depth case study on data preservation, re-use and (open) access within the High-Energy Physics (HEP) community. The first results of this large-scale survey of the attitudes and concerns of HEP scientists are presented. The survey reveals the widespread opinion that data preservation is "very important" to "crucial". At the same time, it also highlights the chronic lack of resources and infrastructure to tackle this issue, as well as deeply-rooted concerns on the access to, and the understanding of, preserved data in future analyses.
LittleSis is an involuntary facebook of powerful Americans, collaboratively edited by people like you.
A simple to learn and use, yet very powerful web extraction framework written in Ruby. Navigate through the Web, Extract, query, transform and save relevant data from the Web page of your interest by the concise and easy to use DSL.
