How statistics are calculated
We count how many offers each candidate received and at what salary. For example, if a Data QA developer with JUnit skills and a salary of $4,500 received 10 offers, we count that candidate 10 times. A candidate who received no offers does not appear in the statistics at all.
Each column in the graph shows the total number of offers. This is not the number of vacancies, but an indicator of the level of demand. The more offers there are, the more companies are trying to hire such a specialist. The 5k+ band includes candidates with salaries >= $5,000 and < $5,500.
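The counting rule above can be sketched in a few lines of Java. This is only an illustration: the $500 band width is assumed from the 5k+ example, and the class and method names are hypothetical, not the site's actual implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: counts offers per salary band.
final class OfferBuckets {
    // Assumed band width, inferred from the "5k+" example (>= 5,000 and < 5,500).
    static final int BAND_WIDTH = 500;

    // Lower bound of the band a (non-negative) salary falls into, e.g. 5,250 -> 5,000.
    static int bandFloor(int salary) {
        return (salary / BAND_WIDTH) * BAND_WIDTH;
    }

    // Each list element is a candidate's salary, repeated once per offer received.
    // Candidates with no offers never appear in the list.
    static Map<Integer, Integer> offersPerBand(List<Integer> salaryPerOffer) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int salary : salaryPerOffer) {
            counts.merge(bandFloor(salary), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(offersPerBand(List.of(4500, 4500, 5100, 5499)));
    }
}
```

A candidate with 10 offers would contribute 10 entries to the input list, which is how demand rather than headcount ends up in each column.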
Median Salary Expectation – the weighted average of the market offer in the selected specialization, that is, the most frequent job offers for the selected specialization received by candidates. We do not count accepted or rejected offers.
Trending Data QA tech & tools in 2024
Data QA
What is Data Quality
A data quality analyst maintains an organisation’s data so that they can have confidence in the accuracy, completeness, consistency, trustworthiness, and availability of their data. DQA teams are in charge of conducting audits, defining the data quality standards, spotting outliers, and fixing the flaws, and play a key role at all stages in the data lifecycle. Without DQA work, strategic plans will fail, operations will go awry, customers will leave, and organisations will face substantial financial losses, as well as a lack of customer trust and potential legal repercussions due to poor-quality data.
The job has changed as much as the hidden infrastructure that transforms data into insight and powers the apps we all use, which is to say, a lot.
Data Correctness/Validation
This is the largest stream of tasks. When we talk about data correctness, the first question to ask is: what does correctness mean for this dataset? The answer differs for every dataset and every organisation. The commonsense interpretation is that the data must be what your end user (or business) wants from the dataset, or what they would expect it to contain.
We can find this out by asking questions or by reading through the requirements. Here are some of the tests we might run in this stream:
Finding Duplicates: nobody wants these in their data.
– Check uniqueness: does every column/field that should be unique contain only unique/distinct values?
– Return any value that is found more than once in your data.
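A duplicate check like the one described above can be sketched as follows. This is a minimal example that treats a column as a plain in-memory list; the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a duplicate check on a single column.
final class DuplicateCheck {
    // Returns every value that occurs more than once in the column.
    static <T> Set<T> findDuplicates(List<T> columnValues) {
        Set<T> seen = new HashSet<>();
        Set<T> duplicates = new LinkedHashSet<>();
        for (T value : columnValues) {
            // Set.add returns false when the value was already present.
            if (!seen.add(value)) {
                duplicates.add(value);
            }
        }
        return duplicates;
    }

    // True when the column contains only distinct values.
    static <T> boolean isUnique(List<T> columnValues) {
        return findDuplicates(columnValues).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(List.of("a", "b", "a", "c", "b")));
    }
}
```

On a real table the same idea is usually expressed as a GROUP BY with a count filter; the in-memory version is just easier to read.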
Data with KPIs – If the data has any column we can sum, or take the min or max of, that column is called a key performance indicator (KPI). In practice these are mostly numeric columns, e.g. Budget, Revenue, Sales. When comparing two datasets, the tests below apply:
– Compare counts between the two datasets and get the difference in count.
– Compare the unique/distinct values and counts for columns to find out which values are missing from one of the datasets.
– Compare the KPIs between the two datasets and get the percentage difference between them.
– Find missing values: rows that are missing from one of the datasets, matched on a primary or composite primary key. This can also be done for a data source that has no primary key.
– Compute the metrics by segment for individual column values. This helps you determine what is going wrong when the count of values in one dataset doesn't match the count in the other, or when some values are missing.
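Three of the comparison tests above (count difference, values missing from one side, and KPI percentage difference) can be sketched like this. The names are hypothetical, columns are plain lists, and the KPI is assumed to be a simple sum measured relative to the first dataset.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of comparing two datasets column by column.
final class DatasetComparison {
    // Difference in row counts (left minus right).
    static int countDifference(List<?> left, List<?> right) {
        return left.size() - right.size();
    }

    // Distinct values that appear in exactly one of the two columns.
    static <T> Set<T> valuesMissingFromEither(List<T> left, List<T> right) {
        Set<T> leftOnly = new HashSet<>(left);
        leftOnly.removeAll(new HashSet<>(right));
        Set<T> rightOnly = new HashSet<>(right);
        rightOnly.removeAll(new HashSet<>(left));
        leftOnly.addAll(rightOnly);
        return leftOnly;
    }

    // Percentage difference of a summed KPI, relative to the left dataset.
    static double kpiPercentDifference(List<Double> left, List<Double> right) {
        double leftTotal = left.stream().mapToDouble(Double::doubleValue).sum();
        double rightTotal = right.stream().mapToDouble(Double::doubleValue).sum();
        if (leftTotal == 0.0) {
            throw new IllegalArgumentException("left KPI total is zero");
        }
        return (rightTotal - leftTotal) / leftTotal * 100.0;
    }

    public static void main(String[] args) {
        System.out.println(valuesMissingFromEither(List.of("a", "b"), List.of("b", "c")));
        System.out.println(kpiPercentDifference(List.of(100.0, 100.0), List.of(150.0, 100.0)));
    }
}
```

Which side counts as the baseline for the percentage is a design choice; the sketch uses the left dataset, but a team might prefer the source of truth, whichever side that is.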
Data Freshness
This is an easy set of checks. How do we know if the data is fresh?
An obvious approach is to check whether your dataset has a date column, in which case you just check the max date. Another is to check when the data was pulled into a particular table. All of this can be converted into very simple automated checks, which we might talk about in a later blog entry.
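The max-date check can be sketched as below. The names and the idea of a "maximum allowed age in days" threshold are assumptions made for illustration; the text only says to look at the max date.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a freshness check on a date column.
final class FreshnessCheck {
    // Latest date in the column; throws NoSuchElementException on an empty list.
    static LocalDate maxDate(List<LocalDate> dates) {
        return Collections.max(dates);
    }

    // True when the newest record is at most maxAgeDays older than "today".
    static boolean isFresh(List<LocalDate> dates, LocalDate today, long maxAgeDays) {
        if (dates.isEmpty()) {
            return false; // no data at all is treated as stale
        }
        long ageDays = ChronoUnit.DAYS.between(maxDate(dates), today);
        return ageDays <= maxAgeDays;
    }

    public static void main(String[] args) {
        List<LocalDate> loaded = List.of(LocalDate.of(2024, 1, 1), LocalDate.of(2024, 3, 10));
        System.out.println(isFresh(loaded, LocalDate.of(2024, 3, 11), 1));
    }
}
```

Passing "today" in as a parameter, rather than calling LocalDate.now() inside the check, keeps the check easy to test.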
Data Completeness
This could be an intermediate step on the way to data correctness, but how do we know whether the data is complete?
To test this, check whether any column has all of its values null. Perhaps that's okay, but most of the time it's bad news.
Another test is single-valuedness: whether every value in a column is the same. In some cases that is a fine result, but in others it is something we'd rather look into.
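Both completeness tests fit in a few lines. This is a sketch with hypothetical names, again treating a column as an in-memory list in which a missing value is represented by null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of two completeness checks on a single column.
final class CompletenessCheck {
    // True when the column has rows and every one of them is null.
    static boolean isAllNull(List<?> column) {
        return !column.isEmpty() && column.stream().allMatch(Objects::isNull);
    }

    // True when the column has rows and they all hold the same value.
    // Note that an all-null column also counts as single-valued here.
    static boolean isSingleValued(List<?> column) {
        return !column.isEmpty() && column.stream().distinct().count() == 1;
    }

    public static void main(String[] args) {
        System.out.println(isAllNull(Arrays.asList((String) null, (String) null)));
        System.out.println(isSingleValued(List.of("GBP", "GBP", "GBP")));
    }
}
```

Neither result is automatically a failure; as the text says, a flagged column is a prompt to go and look, not a verdict.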
What are Data Quality Tools and How are They Used?
Data quality tools are used to improve, or sometimes automate, many of the processes required to ensure that data stays fit for analytics, data science, and machine learning. For example, such tools enable teams to evaluate their existing data pipelines, identify quality bottlenecks, and even automate many remediation steps. Activities that help guarantee data quality include data profiling, data lineage, and data cleansing. Data cleansing, data profiling, measurement, and visualization tools can be used by teams to ‘understand the shape and values of the data assets that have been acquired – and how they are being collected’. These tools will flag outliers and mixed formats. In the data analytics pipeline, data profiling acts as a quality control gate. Each of these is a data management chore.
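To make "mixed formats" concrete, here is one common profiling trick, sketched with hypothetical names: reduce every value to its shape (digits become 9, letters become A) and count how many values share each shape. This is an illustration of the idea, not how any particular tool implements it.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of shape-based profiling for a text column.
final class FormatProfiler {
    // Replaces each digit with '9' and each letter with 'A'; other characters are kept.
    static String shapeOf(String value) {
        StringBuilder shape = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            if (Character.isDigit(c)) {
                shape.append('9');
            } else if (Character.isLetter(c)) {
                shape.append('A');
            } else {
                shape.append(c);
            }
        }
        return shape.toString();
    }

    // Number of values per shape; more than one key suggests mixed formats.
    static Map<String, Integer> shapeCounts(List<String> column) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : column) {
            counts.merge(shapeOf(value), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(shapeCounts(List.of("2024-01-31", "2024-02-01", "31/01/2024")));
    }
}
```

A date column that yields two different shapes, as in the sample input, is exactly the kind of finding a profiling step is meant to surface before the data moves further down the pipeline.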
Where is JUnit used?
The Bug Squasher
- Developers use JUnit to vanquish pesky code critters, ensuring their masterpieces behave as expected, without any mischievous bugs throwing a party in the logic.
The Feature Guardian
- Every time a shiny new feature pops up, JUnit gets down to work, double-checking the newbie plays nice with the old-timers and doesn’t start any code feuds.
The Regression Wrangler
- When coders say, "I just tweaked that," JUnit steps in like a time-traveling sheriff, making sure those ‘innocent’ tweaks didn’t unravel the fabric of the code-time continuum.
The Continuous Integration Sidekick
- Integrated with build tools, JUnit swings into action with every ‘commit,’ ensuring the codebase remains as solid as a rock, even when the code contributions are flying in hot.
JUnit Alternatives
TestNG
TestNG is a testing framework inspired by JUnit but with new functionalities. It is designed for more complex testing scenarios like data-driven testing.
// Assumes a @DataProvider method named "dataMethod" is defined in the same class
@Test(dataProvider = "dataMethod")
public void testMethod(String data) {
    System.out.println(data);
}
Pros
- More flexible and powerful annotations.
- Allows parallel execution of tests.
- Supports data-driven testing natively.
Cons
- Learning curve for JUnit users.
- Less concise than JUnit for simple tests.
- Smaller community compared to JUnit.
Spock
Spock is a testing and specification framework for Java and Groovy applications, combining features of JUnit, mocking frameworks, and BDD style testing.
def "length of Spock's and his friends' names"() {
    expect:
    name.size() == length

    where:
    name    | length
    "Spock" | 5
    "Kirk"  | 4
}
Pros
- Readable specification language.
- Elegant Groovy syntax for tests.
- Powerful mocking and stubbing.
Cons
- Requires Groovy knowledge.
- Less idiomatic for Java purists.
- Not as widely used as JUnit.
Mockito
Mockito is a mocking framework used for unit testing in Java applications. It simplifies the creation of mock objects and is often used alongside JUnit.
// Assumes static imports of Mockito.mock, Mockito.when and JUnit's assertEquals
@Test
public void testQuery() {
    List<String> mockedList = mock(List.class);
    when(mockedList.size()).thenReturn(100);
    assertEquals(100, mockedList.size());
}
Pros
- Intuitive and simple API for mocking.
- Flexible argument matchers and callbacks.
- No boilerplate code to create mocks.
Cons
- Mocking is not a replacement for actual unit testing.
- Mocks are easy to overuse, which can lead to brittle tests.
- Tests with lots of mocks can be hard to read and maintain.
Quick Facts about JUnit
JUnit: A Time-Traveling Bug Zapper
Picture it: the year 1997, when dinosaurs (I mean, old computers) roamed the Earth, and Kent Beck & Erich Gamma unleashed a tiny digital critter known as JUnit into the wild! Unlike T-Rex, it helped programmers by munching on nasty bugs instead of developers. This automated testing framework for Java became the proverbial fly-swatter for code, changing the game before most of our cell phones got smart.
JUnit 4: The Annotation Evolution
Fast forward to 2006, and bam! JUnit 4 pops out with those fancy annotations like @Test, waving goodbye to the primitive era of prefixing your test methods with 'test'. With this upgrade, JUnit said, "Let there be flexibility!" and coders everywhere could organize their bug hunts far more creatively. Behold an example from the annals of this great era:
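// Note: assertThat(...).isTrue() comes from a fluent assertion library such as AssertJ,
// not from JUnit itself; userService is assumed to be a field of the test class
@Test
public void whenCheckingForTheExistenceOfAUser_thenCorrect() {
    assertThat(userService.doesUserExist("Bob")).isTrue();
}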
JUnit 5: The Juggernaut's Latest Jive
Just when you thought it couldn't get better, 2017 brought us JUnit 5, whose new programming model goes by the name Jupiter. This version came out doing the moonwalk while supporting Java 8 lambda expressions, and let's not forget about the new testing interfaces that make it a breeze to write dynamic tests. It's like having a Swiss Army knife for squashing bugs. Here's a taste of lambda flavor in JUnit 5:
@TestFactory
Collection<DynamicTest> dynamicTestsFromCollection() {
    return Arrays.asList(
        dynamicTest("1st dynamic test", () -> assertTrue(true)),
        dynamicTest("2nd dynamic test", () -> assertEquals(4, 2 * 2))
    );
}
What is the difference between Junior, Middle, Senior and Expert JUnit developer?
Seniority Name | Years of Experience | Average Salary (USD/year) | Quality-wise | Responsibilities & Activities |
---|---|---|---|---|
Junior | 0-2 | 40,000-70,000 | Learning Curve | |
Middle | 2-5 | 70,000-100,000 | Consistent Quality | |
Senior | 5-10 | 100,000-140,000 | High Quality Standards | |
Expert/Team Lead | 10+ | 140,000+ | Exceptional Quality | |
Top 10 JUnit Related Tech
Java
Where there is JUnit, there's Java, waving at you like an old friend or that one over-caffeinated colleague. Java is the bedrock of JUnit—literally, JUnit is a Java thing. If you're tiptoeing around JUnit without Java know-how, it's like bringing a spoon to a swordfight. Remember, Java and JUnit go together like coffee and Mondays.
// A basic Java class example
public class HelloWorld {
    public String greet() {
        return "Hello, World!";
    }
}
Maven or Gradle
Build tools are like the backstage crew of a rock concert; without them, there's no music—just awkward silence. Maven and Gradle are the roadies that get your JUnit tests staged and ready to rock. They'll manage your dependencies, making sure you're not left with a JAR jigsaw puzzle from hell.
<!-- Maven dependency for JUnit 4 -->
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
    <scope>test</scope>
</dependency>
Mockito
Mockito: the wizard of the testing world. It conjures up the illusions of your dependencies, so you can test your code in blissful isolation. Think of it as your code's imaginary friend that helps it get through tough times. Creating a mock is easier than convincing my grandma to use Facebook.
// Mocking a dependency with Mockito
List<String> mockedList = mock(List.class);
when(mockedList.get(0)).thenReturn("mockedValue");
Hamcrest
This one is the matcher extraordinaire for assertions. It's like going to a tailor that ensures everything fits your testing expectations perfectly. It'll help you write matcher expressions that read like English, so your tests can sound like Shakespeare while still slaying bugs.
// Using Hamcrest for assertions
assertThat("Hello, World!", containsString("World"));
Spring Framework
If your app is part of the cool Spring crowd, you'll be testing in the Spring context. It's like a fancy garden party for your code, and JUnit is the strict butler ensuring everyone behaves. Get familiar with Spring Boot Test and mock away with Spring's own MockMvc.
// Spring MVC test snippet
@WebMvcTest(YourController.class)
public class YourControllerTest {
    @Autowired
    private MockMvc mockMvc;
}
JUnit 5
JUnit 5, the latest hotshot in the testing suite. It skipped leg day but brought new features to the arms race, like dynamic tests and annotations that make JUnit 4 look like your dad trying to use Snapchat. Mastery is bar-setting; it tells people you don't just test, you test with style.
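// JUnit 5 example with new annotations
// @Nested applies to a non-static inner class, so it sits inside an outer test class
@DisplayName("X")
class XTest {
    @Nested
    @DisplayName("When X is tested")
    class WhenTested {
        @Test
        @DisplayName("should Y")
        void testY() {
            assertTrue(true);
        }
    }
}
IntelliJ IDEA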
IntelliJ is that genius kid in class who did everyone's homework. The IDE that knows more about your code than you do and isn't shy about it. It's got JUnit testing baked in like chocolate chips in a cookie—with JUnit code assistance, running, and debugging tests become a breeze.
// Shortcut to run tests in IntelliJ
// Use the keyboard shortcut: Shift + F10 (on Windows/Linux)
// or Control + R (on macOS)
Git
Git, the time traveler's tool—essential for when you need to go back to when your tests actually passed. Think of it like a safety net for when you're walking the tightrope of testing. With branches, commits, and pull requests, it's like the DVR for your JUnit episodes.
REST-assured
For the JUnit test conductor dealing with RESTful APIs, REST-assured is like that trusty baton. It makes verifying the symphony of HTTP requests easy, like Simon Cowell judging a reality show. Pair it with JUnit to assert your APIs are well-behaved citizens of the internet.
// REST-assured with JUnit example
// Assumes a static import of io.restassured.RestAssured.when
@Test
public void whenGetRequestToUsers_thenOK() {
    when().
        get("/users").
    then().
        statusCode(200);
}
Jenkins
Last but not least, Jenkins – the old but gold continuous integration butler. It stands ready to serve your JUnit tests on a silver platter, automating your builds like a five-star robot chef. Hook it up with your code repository, and it'll spit out test results every time you commit like a jackpot slot machine.
// Jenkins pipeline script for JUnit tests
pipeline {
    agent any
    stages {
        stage('Test') {
            steps {
                sh 'mvn clean test'
            }
        }
    }
}