The first four steps to start a Software Quality Team for MySQL Cluster from the ground up
In June 2018, the first ever re-organization of the Oracle MySQL Cluster group occurred. As in other companies, there comes a moment when a larger group is broken down into smaller teams, each owning a part of the overall process/code-base/tasks. While in some cases the new teams are just a formalization of an already existing informal structure, in others a new team emerges to answer specific long-term needs.
In the MySQL Cluster group, one such team was the Quality Team, which I became responsible for. The long-term needs were to get to grips with the testing infrastructure, ensure reliable test execution and reporting, and evolve the current infrastructure to support developers in creating better tests.
As soon as the team was formed, the first challenge was to define how to get it started. Here are the first four steps our team took.
1. Define Team OKRs
OKRs originate from Andy Grove’s “Management By Objectives (MBO)” methodology. Put simply, one starts by defining high-level objectives and specific measurable key results that are linked to those objectives. “Measurable” is critical to keep things on track. Accomplishing all key results ensures achieving the objective.
Team effectiveness depends on everyone understanding what the right things to work on are, what is important, and what is not. Clear objectives are paramount to ensure this as they set the direction. Key results set the milestones in that direction. Together, objectives and key results act as a powerful framework to simplify alignment on the tasks and priorities the team has to deliver on a daily basis.
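To make the relationship between objectives and key results concrete, here is a minimal sketch in Python. It is not tooling our team used: the class names, the key results, and the numbers are all hypothetical and invented purely for illustration. It only encodes the two ideas above: key results are measurable, and an objective is achieved once all of its key results are met.

```python
from dataclasses import dataclass, field


@dataclass
class KeyResult:
    """A measurable outcome linked to an objective (hypothetical model)."""
    description: str
    target: float          # value that counts as "done"
    current: float = 0.0   # latest measured value

    def progress(self) -> float:
        """Fraction of the target reached so far, capped at 1.0."""
        if self.target <= 0:
            return 1.0
        return min(self.current / self.target, 1.0)

    def is_met(self) -> bool:
        return self.current >= self.target


@dataclass
class Objective:
    """A high-level goal, achieved once all of its key results are met."""
    title: str
    key_results: list[KeyResult] = field(default_factory=list)

    def progress(self) -> float:
        """Average progress across key results (0.0 if none are defined)."""
        if not self.key_results:
            return 0.0
        return sum(kr.progress() for kr in self.key_results) / len(self.key_results)

    def is_achieved(self) -> bool:
        return bool(self.key_results) and all(kr.is_met() for kr in self.key_results)


# Hypothetical key results and numbers, invented for illustration only.
objective = Objective(
    title="Deliver smoke test suite",
    key_results=[
        KeyResult("Smoke tests automated", target=20, current=10),
        KeyResult("Suite runs on every package build (1 = yes)", target=1, current=1),
    ],
)
print(f"{objective.title}: {objective.progress():.0%} done, "
      f"achieved={objective.is_achieved()}")
```

Averaging the progress of key results is just one simple choice; a team could equally weight key results differently or track them only as met/not met.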
The definition of our team’s objectives for the next three months was quite easy as they were long-standing “issues” that needed to be solved. The objectives were:
- Regain the reins of test suite execution & reporting: ensure test suite execution is done reliably and accurate results are reported automatically, so that regressions are found early in the development process.
- Deliver automated performance test suite: finalize the on-going effort to create an automated performance regression test suite, used to ensure new releases do not perform unjustifiably worse than a previous version.
- Deliver smoke test suite: define a minimal automated test suite that can be used both to quickly validate the team’s work-logs and in package verification testing.
- Support team work logs: identify test infrastructure needs and develop required functionality allowing developers to produce maintainable tests, and implement automated analyses ensuring
More details about each of the objectives and their key results will be addressed in a different Medium story.
2. Agree on a work methodology
Different companies follow different methodologies and/or variations of those methodologies — hence it’s important for all team members to be clear on the expectations about how work is going to be organized, assigned, and evaluated.
I had the opportunity of working in large governmental institutions using well-established waterfall methodologies, companies following scrum-inspired agile methodologies, and companies following Scrum “by the book”. Each of these methodologies worked in its own environment and there were strong reasons for why things were as they were. However, I felt most effective when using Scrum.
Because the majority of the team was very junior, most having no previous work experience, frequent communication was of paramount importance so that people speak up about their progress and any issue can be addressed quickly. Scrum fosters team involvement in the tasks being developed (sprint grooming addresses tasks that are not clear enough, and estimation poker ensures everyone has a say on how long each task should take). Continuous improvement is part of the process via Scrum Retrospectives. Finally, Scrum imposes discipline in planning, execution, and follow-up of tasks, and discipline means predictability.
The following rules were agreed: sprints last two weeks, starting on a Monday and finishing on a Friday; daily sprint meetings are held to discuss any possible blocks/issues; the tasks for the next sprint must be defined by the end of the sprint’s first week (allowing one week for sprint grooming); and on the Friday at the end of the sprint, all three meetings are held: end-of-sprint, sprint retrospective, and sprint planning.
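The rules above are easy to turn into dates. The following is a small sketch, not something our team actually ran: the function name `sprint_calendar`, the dictionary keys, and the example start date are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def sprint_calendar(start: date) -> dict:
    """Derive the key dates of a two-week sprint that starts on a Monday."""
    if start.weekday() != 0:  # Monday is 0 in Python's datetime
        raise ValueError("sprints start on a Monday")
    # Monday of week one through Friday of week two, skipping the weekend.
    working_days = [
        start + timedelta(days=offset)
        for offset in range(12)
        if (start + timedelta(days=offset)).weekday() < 5
    ]
    return {
        # One daily meeting per working day of the sprint.
        "daily_meetings": working_days,
        # Next sprint's tasks are defined by the end of the first week.
        "next_sprint_tasks_due": start + timedelta(days=4),
        # End-of-sprint, retrospective, and planning all happen on the last Friday.
        "end_of_sprint_meetings": start + timedelta(days=11),
    }


# Hypothetical start date (a Monday), used only as an example.
calendar = sprint_calendar(date(2018, 7, 2))
print(calendar["next_sprint_tasks_due"], calendar["end_of_sprint_meetings"])
```

Rejecting a start date that is not a Monday keeps the sketch honest about the agreed rule instead of silently shifting the sprint.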
3. Perform a premortem
A project’s premortem is a managerial strategy proposed by Gary Klein that promotes a discussion to identify issues that can cause a project to fail, by assuming the project has already failed.
We had two goals in mind when doing a premortem. First, to ensure that everyone on the team has a voice about their concerns/fears/doubts on achieving the goals and key results. Second, to promote a positive discussion on how to address the raised concerns by defining specific actions.
Instead of the initially planned half-day session, we ended up taking a full day to allow time for all issues to be discussed. Because we’re a remote team, https://sketchboard.me/ was used. Over 30 issues were anonymously added in real-time. In round-robin manner, each team member picked a topic that was then discussed by the group. Different interpretations were discussed and action items were defined. Issues similar to ones already discussed were grouped with them, and the action items were reviewed to ensure all aspects of the issue were covered.
Out of all issues, three big topics surfaced:
- Failure in effort estimation, gathering 1/3 of all issues, due to the team’s current lack of knowledge of the infrastructure and tools. This was addressed by creating small stories, defining the required changes (files/classes/etc.), and ensuring that everyone votes on an estimate.
- Lack of communication/failure to communicate blockers, hindering delivery of work. This was addressed by maintaining daily scrum meetings to raise these concerns, and immediately following up on each blocker.
- Lack of support when key people, who could clarify issues or unblock decisions, are away due to illness or holidays. This was addressed by finding stand-ins for the “Scrum Master”, “Technical Lead” and “Product Owner” roles.
Other issues ranged from “will we be a tester team for other teams?” to concerns about the current infrastructure, each addressed with specific actions.
4. Start first sprint
Nothing proves that things work better than trying them out. That came in two parts: planning and execution.
Planning the first sprint is hard, especially if you take the roles of Product Owner and Scrum Master, and you’re also the most experienced member of the Development Team. The first difficulty is to identify bounded tasks that address the current issues and are aligned with the different key results. The second difficulty is to put those tasks into words, with the right context for others to pick them up and do them. These difficulties are common to people who transition from individual contributor to manager. As an individual contributor you know the problems and the pieces of code that need to be changed, tested, and deployed. As a manager the challenge is to coach people to know all those things, ensure they can do the work autonomously, and guarantee they have all the resources to do their best work. It takes time and learning.
Executing the first sprint is when you do the reality check of all the things you have agreed with your team. We started with Sprint #0, allowing some team members to close pending tasks and others to already start on our team tasks. Sprint #0 was used to get the team accustomed to the routine of the daily meetings and to push them to communicate with others. It was also during this time that we set up our own JIRA with an initial set of backlog tasks and the boards to be used in Daily Scrum Meetings. In Sprint #1 all the team members were 100% dedicated to the team’s tasks and the execution was a disaster (an expected one): of the 36 story points defined only 8 got delivered, with everything undelivered transitioning to the next sprint. The lack of delivery in Sprint #1 solved the planning difficulties for Sprint #2. In Sprint #2, delivery greatly improved to 22 story points. After Sprint #3 the team had a good feeling for the “mechanics” of the methodology and a better understanding of the challenges, so the focus shifted to improving people’s knowledge.
Is a team done with these first steps? Far from it… these are the first steps on a long road of challenges, good and bad decisions, and great opportunities to improve. Subsequent stories will discuss those.
What is MySQL Cluster?
MySQL Cluster is an open-source distributed database combining linear scalability and high availability that runs on commodity hardware. It provides in-memory real-time access with transactional consistency across automatically partitioned and distributed data sets. It is designed for mission critical applications. MySQL Cluster includes the standard MySQL Database server for ease of integration with existing applications supporting SQL.
Check out MySQL Cluster at: https://www.mysql.com/products/cluster/
About the author
Tiago L. Alves is a Senior Software Engineer at Oracle MySQL Cluster. During the past 6 years he has been working on software quality automation in different companies (Talkdesk, Microsoft, and OutSystems), and he holds a PhD in Software Product Quality Metrics in partnership with the Software Improvement Group (SIG) in The Netherlands and University of Minho in Portugal.
 Management By Objectives (MBO) is described in “High Output Management” by Andrew S. Grove. This was made popular by John Doerr with his work at Google and has been described in “How Google Works” by Eric Schmidt and Jonathan Rosenberg, and later in “Measure What Matters” by John Doerr.
 “MySQL Worklogs are design specifications for changes that may define past work, or be considered for future development.” https://dev.mysql.com/worklog/
 “Performing a Project Premortem” by Gary Klein, Harvard Business Review, http://www.drillscience.com/DPS/Project%20Pre-Mortem%20HBR.pdf; also described in “Thinking, Fast and Slow” by Daniel Kahneman.