herraiz.org | Blog
Popularity bias in bug datasets
In recent times, the replicability of empirical Software Engineering studies has become a major concern in the research community. One way to achieve replicability is to reuse datasets, so that everybody bases their results on the same data. However, if these datasets contain any kind of problem, they could do more harm than good.
In the case of software defects, there are datasets that are known to contain bias, mainly in how fixes are linked to particular bug reports.
We have studied a different kind of bias: popularity bias. Conventionally, a software project with fewer bugs is considered to be of higher quality. However, in open source software development, more bugs may mean more quality. Why? Because more bugs found imply more people looking for those bugs. That is, if you have no bugs, it may be because nobody is using your software and reporting them. If you have more bugs, it may be because your software is popular; were your software less popular, the number of reported bugs would be lower. We have studied this effect in the case of Debian, using the Ultimate Debian Database, and we indeed find that only very popular Debian packages present a very high number of bugs, and that unpopular packages get very few bug reports.
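To make the argument concrete, here is a minimal sketch of how raw bug counts and bug counts relative to installations can rank packages differently. It is not the analysis from the paper: the `Package` record, the `bugs_per_thousand_installs` helper and the two packages are hypothetical, and the numbers are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    installations: int  # how many systems report the package as installed
    bugs: int           # how many bug reports have been filed against it


def bugs_per_thousand_installs(pkg: Package) -> float:
    """Bug reports per 1,000 installations; 0.0 if nobody installed it."""
    if pkg.installations == 0:
        return 0.0
    return 1000.0 * pkg.bugs / pkg.installations


# Made-up numbers (assumption): not taken from the paper or from Debian.
packages = [
    Package("popular-tool", installations=100_000, bugs=500),
    Package("niche-tool", installations=200, bugs=4),
]

# Ranking by raw bug count puts the popular package first...
by_raw_count = sorted(packages, key=lambda p: p.bugs, reverse=True)
# ...while ranking by bugs per installation puts the niche package first.
by_rate = sorted(packages, key=bugs_per_thousand_installs, reverse=True)

print([p.name for p in by_raw_count])
print([p.name for p in by_rate])
```

In this toy example the popular package has far more bug reports in absolute terms, but fewer per installation, which is why a raw bug count alone can be misleading as a measure of quality.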
If you want to know more, read our WCRE 2011 paper, entitled "Impact of Installation Counts on Perceived Quality: A Case Study on Debian". A tag cloud of the contents of the paper:
age analysis binary case data days debian defects developers different distribution engineering fixed groups higher installations number open packages popularity quality recent relationship reported shows software source study system users
To cite this paper, there is a BibTeX file available, or you can copy the entry below:
@InProceedings{debian_wcre2011,
  author    = {Israel Herraiz and Emad Shihab and Thanh H.D. Nguyen and Ahmed E. Hassan},
  title     = {Impact of Installation Counts on Perceived Quality: A Case Study on {D}ebian},
  booktitle = {Proceedings of the 18th Working Conference on Reverse Engineering},
  year      = {2011},
  publisher = {IEEE Computer Society},
}
Written on Nov 01 2011
IJSODIT - Call for papers 2012
The International Journal of Social and Organizational Dynamics in Information Technology (IJSODIT) calls for papers for its 2012 issues.
The mission of this journal relates to social issues in information technology. Social issues are those research topics most aligned with the human factor in information systems planning, development and utilization. This journal covers all aspects of social issues that are impacted by information technology and that affect organizations and interorganizational structures. This includes the conceptualization of specific social issues and their associated constructs, proposed designs and infrastructures, empirical validation of social models, and case studies illustrating socialization successes and failures. Some key topics may include:
- Ethics
- Culture
- Relationships
- Human interaction
- Security
- Design
- Building relationships
- Diversity in the IT workforce
This journal follows a full blind peer review process.
Written on Sep 29 2011
The interplay between businesses and open source
This month, IEEE Software magazine features an interesting article about the impact and possibilities of different open source licenses on business models. The paper is available at the IEEE Digital Library: Matching Open Source Software Licenses with Corresponding Business Models. From the abstract:
Scores of software producers have turned toward open source licenses to improve service for their customers. For these companies, choosing the correct license determines business success. When the available open source stack and licensing options grow, so does the need to understand the interplay among licensing, sourcing decisions, and business goals. A model of license choice emphasizes different licenses and rationalizes the choice of an open source software (OSS) license. This is crucial for smaller companies and start-ups that don't have the tools and knowledge to perform a thorough investigation of all the consequences of their license choice every time they employ OSS.
Furthermore, Computer magazine also carries an interesting article about how to manage open source projects from the point of view of a software firm. It is also available in the IEEE Digital Library: Controlling and Steering Open Source Projects.
Please bear in mind that the IEEE Digital Library is behind a paywall. If you cannot access the papers, send me a message and I can send you a copy.
Written on Sep 08 2011
Older posts
- Software and the game of life (Jul 29 2011)
- What's the distribution of software size? (Jul 20 2011)
- Software projects alzheimer: Julian Assange's lost contributions (Jul 07 2011)
- Practical Analyses of Software Engineering Data (Jun 15 2011)
- Empirical Software Engineering in Practice -- CFP 2011 (Jun 13 2011)
- Graffiti is not a business -- My view on the protest camps (May 25 2011)
- IJSODIT - Call for papers 2011 (Mar 29 2011)
- My impressions of Garum Day (Mar 05 2011)
- We are going to Bilbao (Feb 15 2011)
- Reflections on cyberpunk (Feb 03 2011)
- The dynamics of software evolution (Jan 24 2011)
- How did I get to the itinerary? (Jan 10 2011)
- Hello, itinerary! (Jan 04 2011)
- Debian finally shipping a free kernel (Dec 15 2010)
- Freenet, an anonymous and distributed network (Dec 11 2010)
- PyTwerp working again with Twitter (Dec 10 2010)
- "Making software" is out! (Nov 22 2010)
- Do featured articles get more visits in Wikipedia? (Nov 15 2010)
- What is the MSR challenge? (Oct 11 2010)
- IWESEP 2010 -- International Workshop on Empirical Software Engineering in Practice (Aug 23 2010)
- Learning by doing (Aug 10 2010)
- Data for Mining Software Repositories (Jun 25 2010)
- The eye of the tiger: agile methods vs. architecture (Jun 21 2010)
- Code as design. Or what's the point of Software Engineering? (Apr 06 2010)
- Hello Linkedin (Apr 02 2010)
- Special issue of the IJOSSP (Feb 23 2010)
- Where are you? (Feb 05 2010)
- New GPG key (Jan 27 2010)
- Under attack (Jan 19 2010)
- Hello world (Jan 18 2010)