Tuesday, February 26, 2013

Reflections on my Progress

Team Obsidian is confirmed to have a booth at POSSCON. The team sat down tonight to discuss the work we need to finish. The tutorials on the Wiki and some small changes to the code still need to be done, but everyone on the team seems to be getting better acquainted with the code base. Now that we know we have a booth, we need to determine our plan for the table. Everyone agreed that a demo would be cool. That way we can show that this isn't just a planned project but one that is actually in a workable form, so the theory of Stone Soup can be applied (Stone Soup being the idea that if you have a workable base, other people are more interested in helping/giving). So we are planning on finding other open source Java projects to run Obsidian on during our time at the table. Another idea is to have a "floating" computer where people can type in their information so we can email them later to remind them to check out the project. We have the stickers that we designed for the project, and they look amazing, so we will also be giving away stickers to people who come up to the table.

The documentation that I chose to work on consisted of "How to Contribute" and "How to Run". In my opinion, "How to Contribute" was harder than "How to Run". Coming up with the rules and norms for submitting a patch, as well as the standard coding practices we use, was a little difficult. I emulated some of the bigger open source projects while keeping in mind that we are a smaller project, so automated scripts and automated testing are a little bit of overkill. During the meeting we discussed how people can claim bugs and our preferred way of submitting solutions. The contributing page also currently explains how to get and build from source. I put it there originally because contributors need to be able to get the source, but when we ran a few tests with our fellow students, a lot of them went to the "How to Run" page to build from source. I'm planning on talking to the group about either making a separate page for building from source or linking the "How to Run" page to that section of the contributing page.

Everyone is busy working on their bugs and documentation, but lots of us have been feeling a strain with the semester coming into full swing. I know personally I need to work on other classes a little more, so before I go to POSSCON I'll be sitting down and working on all the homework for the next week so I don't have to worry while I'm at POSSCON.

Thursday, February 21, 2013

Refactoring Mindset

What is good code? Code that runs, right? Well, if you never have to go back and work on the code, then yes, that would be the definition of good code. But in the open source community there will always be people who need to come in after you to fix bugs. Readability comes into play when you work with someone other than yourself. If I asked you what this code segment does, you would have to play with it for a long time just to understand what does what:


#include <stdio.h>
#include <math.h>
//What does this do?
double l;main(_,o,O){return putchar((_--+22&&_+44&&main(_,-43,_),_&&o)?(main(-43,++o,O),((l=(o+21)/sqrt(3-O*22-O*O),l*l<4&&(fabs(((time(0)-607728)%2551443)/405859.-4.7+acos(l/2))<1.57))[" #"])):10);}

This code is what we would call "bad code", and to tell you the truth it was designed that way. It was written by Natori for the 15th International Obfuscated C Code Contest. His code prints an ASCII moon to the terminal that represents the current phase of the moon. The problem with code today is one of readability.

Refactoring is the idea of taking a segment of code and changing it to act the same externally while changing the internal workings to have reduced complexity as well as improved readability and maintainability. For our homework we had to take some code from the RMH Homebase and refactor a specific piece of code into something more manageable and readable.

if ($edit==true && !($days[6]->get_year()<$year || ($days[6]->get_year()==$year && $days[6]->get_day_of_year()<$doy) ) && $_SESSION['access_level']>=2)

Knowing little to no PHP did not help me with this problem. However, looking at the code, it is easy to see that they are checking for very specific parameters to be true or false. Taking the existing code and just adding some spacing and comments allows the next set of people who come along to make modifications.


if ($edit==true //if editing is enabled
&& //and
!($days[6]->get_year()<$year //NOT ($days[6] is in a year before $year
|| //or
($days[6]->get_year()==$year && $days[6]->get_day_of_year()<$doy) ) //it is in the same year but on an earlier day of the year)
&& //and
$_SESSION['access_level']>=2) //the access level is 2 or greater

This (while still ugly) is better than having the next person come along and not understand what the if statement is doing. A comment before the if statement explaining the requirements of the conditional can also improve readability. The simple act of adding a line stating "Do the following if editing is enabled, the access level is 2 or more, and $days[6] does not fall before the given year and day of year" can make this line just that much better.
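A further step beyond comments is to give each part of the condition a name. The sketch below is written in Java rather than PHP, and every name in it (ScheduleAccess, canEditWeek, isBefore, MIN_EDIT_ACCESS_LEVEL) is made up for illustration. It is not code from RMH Homebase, only one way the same check might read after refactoring.

```java
// Hypothetical Java rendering of the PHP condition; all names are illustrative only.
final class ScheduleAccess {
    // Assumed meaning of the literal 2 in the original check.
    static final int MIN_EDIT_ACCESS_LEVEL = 2;

    // True when (dayYear, dayOfYear) falls before (year, doy).
    static boolean isBefore(int dayYear, int dayOfYear, int year, int doy) {
        boolean earlierYear = dayYear < year;
        boolean sameYearEarlierDay = dayYear == year && dayOfYear < doy;
        return earlierYear || sameYearEarlierDay;
    }

    // Same truth table as the original if statement, with each clause named.
    static boolean canEditWeek(boolean edit, int dayYear, int dayOfYear,
                               int year, int doy, int accessLevel) {
        boolean dayIsNotBeforeCutoff = !isBefore(dayYear, dayOfYear, year, doy);
        boolean hasAccess = accessLevel >= MIN_EDIT_ACCESS_LEVEL;
        return edit && dayIsNotBeforeCutoff && hasAccess;
    }
}
```

The external behavior is unchanged, which is the point of refactoring: only the names and the structure are new.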

The only problem with comments like these is that they put programmers into the habit of over-commenting their code. I have seen plenty of students whose teachers punish them for not commenting. However, I think that with good variable names and simple structures, comments can be saved for the occasional complicated structure that we need to solve the problem, which is where they are actually useful. Jeff Atwood shares my sentiment about comments in code in a post on his blog, Coding Horror.

Tuesday, February 19, 2013

That could be me in x years

Today was the Alumni Symposium at CofC. Alumni from the computer science department came in to talk to the current students. Each of them was given 3 minutes to talk about whatever they wanted, followed by a Q&A where the current students could ask anything. The theme for this year's event seemed to be "personal projects = jobs". Most of the speakers talked about how they got into the position they are in now by having personal side projects that they worked on before they started working, and projects that they keep up with even while they have a full-time job. The advice came with only one warning: make sure that a project you work on outside of work does not break your employment contract. I think this is really good advice, and it will probably follow me for the rest of my career, seeing as how I love working with open source.

Speaking of open source, lately I have been working with Obsidian to get it ready for POSSCON. To understand the code I have been dissecting different parts, changing things here and there to see the effects and understand the control flow of the program. One of the things I'm currently looking into is making the source code IDE independent and coming up with tutorials on how to make it work on most platforms. That should be an interesting task for me; playing around with Ant and writing technical documentation will be a learning experience.

Going back to what I could be doing in the future, I have given a lot of thought to what I want to do as a career, but I haven't decided on anything yet. The main issue that I face is whether I want to switch careers halfway down the line or stick to one track. The two tracks that I'm torn between are teaching and software engineering. I love the idea of teaching the next generation, but I feel that without some real world experience (10 years as a software developer or so) I might not be able to impart extra wisdom to the class (the extra wisdom being non-academic advice). So the way I'm thinking about my future right now is to work writing software for a few years and eventually teach on the side while I work. Later I would transition from that dual job title and go fully into teaching. The only problem with this is the advice that I have been getting from people. Most of the academics and non-academics that I talk to suggest that I should go straight into teaching and not bother with the non-academic sector because of the money. Apparently if I go into the private sector I will get used to a lifestyle and not want to return to academia. I do not know how true this is, but enough people have warned me about it to put serious doubts in my head about the dual career path.

Whatever path I decide to take, I still think it would be cool to bring to fruition the one idea I have had for a while. I have always wanted to make a company where the contracts we take on pay the bills, while the other half of the company is an R&D/open source department. Anyone in the company could take a day a week to work on an open source project (or create one) and put it out there for other people to use. I figure this way the open source software we use will be improved with the features the company wants, and if people see the work that our workers are capable of, they will pay for our services. The other part of the company is the student side. I would love to have students on the team, working on both sides of the company and getting real world experience alongside the academic experience they are receiving. This is something I really need to put a lot of thought into before my undergrad finishes up, so that I have a clear idea of what I need to do moving forward.

Thursday, February 14, 2013

What's Happening?

For this blogpost we had to grab a recent article from an ACM or IEEE magazine and talk about it. I chose to write about "The Great and Terrible Oz" by Grady Booch from the January/February 2013 edition of IEEE Software.

The article expresses the importance of educating the public about computers. As technology use increases and gains momentum, the public seems to be getting further and further from how software actually works. Everyone from your parents to politicians is losing touch with how technology works, and this is a problem. The further the curtain keeps people from that knowledge, the less they understand about technology and the more "magic" they see in the system. I have seen this go awry many a time. People think that they "just get viruses" because of a lack of knowledge. People pay a prince in Nigeria in hopes of getting rich. People steal software because "it is already made". Without the proper knowledge people are going to get more and more lost.

The problem with this day and age is that technology is growing faster than the population can adapt. Sure, the tech savvy know what is going on, but the rest of the world will blindly go out and buy iPads because of targeted advertisements. My parents in particular still ask me for computer help from time to time, but I have trained them to a level where they only ask if they really need help. Another thing that I have done to help them be more educated is to send them videos about scams and tutorials on different technologies. They won't be falling for a 419 scam anytime soon because they know a little bit about electronic banking.

I have also noticed that with education the number of viruses on a given machine approaches zero. Switching my parents from IE to Chrome (admittedly by tricking them into thinking Chrome was IE) drastically reduced the number of phone calls asking for help. Now, I'm not saying that everyone has to do what I did to my parents (that was for my own personal gain), but a little bit of education goes a long way. With acts like SOPA and PIPA going through Congress, educating the public would go a long way toward protecting the internet and my job security in the future.

Wednesday, February 13, 2013

Squashed?

So I have created a patch that we have now accepted into the trunk repository.

The issue I worked on was that PackageEqualityMethod was not importing the public/package/protected inner classes that are used in the "areEqual" methods. At first I added a few new methods to the system to try to grab all instances of inner classes in a specified package. At a code review session, though, we talked about how to take what I learned from creating those methods and instead modify the existing code to grab the inner classes, seeing as how it already grabs everything else.

Here is an example of a class that we are trying to grab:

//Demo of program that creates bug
//would import just fine in PackageEqualityMethods
public class Foo{
    private int x;
 
    public void method1(){
        //Do stuff
    }
 
    //would never import into PackageEqualityMethods
    public class Bar{
        private int y;
        private void method2(){
              //Do stuff
        }
    }
 
    //never needed to be imported because it is private
    private class PrivClass{
        //Private class
    }
}


This class produces 2 areEqual methods, one for Foo and one for Bar. The private class does not generate an areEqual method, but we are considering making that enhancement later if there is enough public demand.

Here is part of the prototype code that was used to grab the inner classes and put them in the import statement list:

//Class TestAbstract (requires: import java.lang.reflect.Modifier;)

//Method to grab all internal classes and add them to the imports
public void addInnerClasses(Class<?> classToScan) {
    Class<?>[] classes = classToScan.getDeclaredClasses();
    for (Class<?> c : classes) {
        //getName() returns the name with a "$" to show it was internal to the class; replace it with a "."
        String importString = c.getName().replaceAll("\\$", ".");
        //Check to see if it is a member or an enum, and if the modifier is not private
        if ((c.isMemberClass() || c.isEnum()) && (!Modifier.isPrivate(c.getModifiers()))) {
            //if the class is not already imported
            if (!dynamicImportContains(importString)) {
                getDynamicImports().add(importString);
            }
        }
    }
}
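To try the same reflection calls outside of Obsidian, here is a small self-contained sketch. The Foo stand-in, the InnerClassImports class, and the importsFor method are all hypothetical names made up for this example; only getDeclaredClasses(), getName(), getModifiers(), and Modifier.isPrivate() are real JDK calls.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the Foo example; not part of Obsidian.
class Foo {
    public class Bar { }
    public enum Color { RED }
    private class PrivClass { }
}

// Hypothetical helper that mirrors the prototype without Obsidian's import list.
final class InnerClassImports {
    // Returns import names for every non-private nested class of classToScan.
    static List<String> importsFor(Class<?> classToScan) {
        List<String> imports = new ArrayList<>();
        for (Class<?> c : classToScan.getDeclaredClasses()) {
            if (Modifier.isPrivate(c.getModifiers())) {
                continue; // private nested classes cannot be imported elsewhere
            }
            // getName() yields "Foo$Bar"; an import statement needs "Foo.Bar".
            String importString = c.getName().replace('$', '.');
            if (!imports.contains(importString)) {
                imports.add(importString);
            }
        }
        return imports;
    }
}
```

Calling InnerClassImports.importsFor(Foo.class) returns entries for Bar and Color but not for PrivClass. The order of the entries is not guaranteed, because getDeclaredClasses() does not promise one.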



We then found the part of code that imports classes into the PackageEqualityMethod, and modified it to create this:

//code segment in Class TestAbstract
boolean shouldBeIgnored = false;
 
//code to check if it is not already in dynamic imports
 
//make sure: not in same package
if (classToImport.getPackage().toString().compareToIgnoreCase(
  classTested.getPackage().toString()) == 0) {
  //Added statement to allow member/enum/local classes.
  if (!(classToImport.isMemberClass() || classToImport.isEnum()
                     || classToImport.isLocalClass())) {
    shouldBeIgnored = true;
  }
}
 
//rest of the class that creates the import statements


The final solution to fix this bug was a single conditional statement added to make sure that inner classes were not being ignored.
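The effect of that conditional can be checked in isolation with a sketch like the one below. ImportFilter, packageName, and this standalone shouldBeIgnored method are hypothetical names, and the package comparison is simplified to a plain string comparison. It assumes, as the fragment above suggests, that a class in the same package as the class under test is skipped unless it is a member, enum, or local class.

```java
// Hypothetical standalone version of the check; not Obsidian's actual code.
final class ImportFilter {
    // Package name of a class, or "" when the class has no Package object.
    static String packageName(Class<?> c) {
        Package p = c.getPackage();
        return p == null ? "" : p.getName();
    }

    // True when no import statement should be generated for classToImport.
    static boolean shouldBeIgnored(Class<?> classToImport, Class<?> classTested) {
        boolean samePackage =
                packageName(classToImport).equalsIgnoreCase(packageName(classTested));
        boolean nestedOrEnum = classToImport.isMemberClass()
                || classToImport.isEnum()
                || classToImport.isLocalClass();
        // Same-package top-level classes are visible without an import;
        // nested classes and enums still need one.
        return samePackage && !nestedOrEnum;
    }
}
```

With java.util.HashMap as the class under test, java.util.ArrayList would be ignored (same package, top level), while java.util.Map.Entry would still be imported because it is a member class.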

All in all I believe that sitting down with the code and running multiple open source programs through it has brought me to a better understanding of how reflection works in Java, as well as a greater understanding of Obsidian. After submitting the patch (diff file) I have also started looking into what coding standards and patch requirements we might implement when we release the software in 5 weeks, but I'll talk about my findings next time.

Monday, February 11, 2013

This Bugs Me

Let me talk about my first bug in Obsidian, but first let me explain what the PackageEqualityMethod is and why it is important.

PackageEqualityMethod is a class generated when creating the framework for testing. It holds equality methods for all the classes in the package, which the test engineer can then modify so that non-primitive classes have an isEqual method that we use to check for equality. Normally a test engineer needs to write these out for each class that needs one, but Obsidian takes care of this by having three levels of equality checking. A GlobalEqualityMethod is currently being developed that will have equality methods for every class in the project. This will be the main location where test engineers put most of their work. Below that is the PackageEqualityMethod. The package equality will by default call the global equality, but if needed the test engineer can go in and change the way equality is checked for that specific package. The third level is the individual class. Its equality method will by default use the package equality methods (which in turn may be using the global ones if they have not been modified), but it can also be modified to test for equality in a different way for just that class.
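The three levels can be pictured as a chain of defaults. The following is only a sketch of that idea: Point, GlobalEqualitySketch, PackageEqualitySketch, and PointTestSketch are hypothetical names, the method is called areEqual here purely for illustration, and the code Obsidian actually generates may look quite different.

```java
import java.util.Objects;

// Sample class under test that does not override equals(); illustrative only.
final class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

// Level 1: project-wide default.
final class GlobalEqualitySketch {
    static boolean areEqual(Object a, Object b) {
        return Objects.equals(a, b);
    }
}

// Level 2: package-wide methods that defer to the global level by default.
final class PackageEqualitySketch {
    static boolean areEqual(Object a, Object b) {
        return GlobalEqualitySketch.areEqual(a, b);
    }

    // Edited by a test engineer for this package: compare Points field by field.
    static boolean areEqual(Point a, Point b) {
        if (a == null || b == null) {
            return a == b;
        }
        return a.x == b.x && a.y == b.y;
    }
}

// Level 3: per-class method that defers to the package level by default.
final class PointTestSketch {
    static boolean areEqual(Point a, Point b) {
        return PackageEqualitySketch.areEqual(a, b);
    }
}
```

Because Point has no equals() of its own, the global default treats two distinct Point objects with the same coordinates as different, while the edited package-level method (and therefore the class-level one) treats them as equal.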

Currently the issue tasked to me is Issue 6. Issue 6 deals with the building of the PackageEqualityMethods and their missing required import statements. I have been studying the errors with a few open source projects, and it appears to be an issue with nested classes. I found this out by running Obsidian on the different open source projects and manually fixing the errors. As I fixed the errors I went into each offending class that was not being imported and discovered that all of them were nested classes or, in certain conditions, enums. While one of the projects I tested had a lot of nested classes, most of them were not accessible outside their enclosing class (most were private). The reason for the errors was that Obsidian was trying to make an equality method for the nested class without importing it specifically. An example of a class that produces the bug follows:


public class Foo {
    private int x;

    public void method1() {
        //Do stuff
    }

    public class Bar {
        private int y;

        private void method2() {
            //Do stuff
        }
    }
}

This creates a problem whenever Obsidian creates the equality methods, because the PackageEqualityMethod does not "import com.foo.bar"; it only imports "com.foo". It also creates a problem when two nested enum classes are put into the PackageEqualityMethod. I am currently working on how to change the import creation, but depending on what the group thinks I might instead change the way Obsidian creates the equality methods so that it ignores nested classes. That second option feels like a cheap way of fixing the problem, and it would also hinder the test engineer: if the code has a public nested class, you should test it separately as if it were another class.

Tuesday, February 5, 2013

Bug Juice

The team has now come out with a public Google Code repository that we are using to track our progress as well as the bugs we find and the features/enhancements that we want in the software. Our new timeline is to have all the functionality and bugs mostly done (if not completely done) by POSSCON. This is going to require a lot of effort from the group, especially if we need all the documentation done at the same time. That is why the plan is for Hunter (the main developer currently) to continue to integrate the enhancements while Micah, Laryea, Joanna, and I fix bugs. During the bug fixing we are also assigned secondary tasks. These tasks include adding new bugs and features that we find/want to the issue tracker, as well as working on documentation. Hunter and Laryea are creating a new logo and website, Joanna is extracting the information from the academic paper and putting it into the documentation, Micah (self-dubbed King Wiki) is adding what we know about Obsidian and how it runs to the Wiki, and I'll be working on creating guides for how to work on the project with different IDEs and OSes.

The issues that I have been assigned are fixing the generated PackageEqualityMethod classes and what they import, as well as implementing the Null and Primitive equality methods at the global level. The first one that I want to tackle is the import issue with PackageEqualityMethod. This task is going to involve finding out how Obsidian currently gathers the required imports, finding the import statements that are commonly missing, and figuring out why Obsidian misses them. I'll be spending a good amount of time finding other open source projects written in Java to throw Obsidian at so I can have sufficient data about the missing import statements. Hopefully I'll be able to see common trends in the missing imports (an example being that it misses every import from "com" packages but not from "org" packages) and then quickly come up with a patch.

The other teams all seem to be working hard on their respective projects. I'm really interested in how this class succeeds in reaching their personal goals for this semester. Some teams have picked to work on a lot of bugs while a few will be working on primarily enhancements for their project. I can't wait to see what happens in the next few months.

Sunday, February 3, 2013

Reflections on Open Source in Today's World

What license should we use?

Obsidian is currently using the MIT license, and seeing as how we haven't released it yet, we still have an opportunity to change it. After reading "Which open source license should you use", I think it might become a topic of discussion for our group. The main thing we would have to discuss is how much freedom people have in choosing whether to return their changes to the main developers. This is where the idea known as copyleft comes in. Instead of using copyright to forbid anyone from modifying or redistributing the product, copyleft lets people work on it almost as if it were in the public domain (which is where some developers do put their work), while someone still retains ownership. The owner of the software allows you to make modifications, but when you publish your modifications they must fall under the same rules of modification that you received from the source.

The different licenses have specific uses and rules. One example is the ability to use a copyleft-licensed product (or parts of its code) in a proprietary system; licenses that allow this include the Microsoft Reciprocal License and the Mozilla Public License. If you want to allow unlimited freedom with the software, the MIT and 3-clause BSD licenses have no copyleft and no discussion of patents. Both of these licenses let anyone tinker with the code and use it in any fashion they want, which reflects the academic ties that they both grew from. If you want to make sure that every published modification stays under the same terms, and that everyone who uses components gives their users information about the component, the GPL licenses are most likely the best ones. Another option we could possibly use is to change the IP/license depending on the entity using it. We could have specific licenses depending on whether the software is for company, student, or personal use. We will have to discuss as a group what type of license we like the most.


What hosting should we use?

Obsidian has been using CofC's SVN server during the development that Hunter has already put into it. Now we need to pick what infrastructure we want, as well as who is going to host it. The first option that comes to mind is keeping it on the SVN server at CofC. The only downside that I can see is that if we wanted to add people to the project, we would have to develop a portal for giving usernames/passwords to potential contributors. The other point I would single out is that plenty of people already offer this service. Why confuse people with a new system/repository when they are most likely already used to the other repository services?

The main repository services that open source developers like to use include SourceForge, GitHub, Google Code, Gitorious, and Bitbucket. They all allow you to add your source code as well as a download page that gives people easy access to the binary files. Each has its strengths and uses, but we are thinking of going with Google Code. We liked it for the built-in Wiki and issue tracker, as well as the ability to integrate Google Groups for easy management of the contributors. The one benefit that I like about GitHub is its very social environment, by which I mean that it is very easy for someone to develop their own branch and create a patch that we could then integrate with the base. This question will most likely just come down to the personal preferences of everyone in the group.