One Year Later, Still Struggling with Unit vs Integration vs E2E testing - javascript

I recently watched a number of talks from the AssertJS conference (which I highly recommend), among them @kentcdodds' "Write Tests, Not Too Many, Mostly Integration." I've been working on an Angular project for over a year, have written some unit tests, and just started playing with Cypress, but I still feel this frustration around integration tests and where to draw the lines. I'd really love to talk to some pro who does this day in and day out, but I don't know any where I work. Since I'm tired of not being able to figure this out, I thought I'd just ask the world here, because you all are fantastic.
So in Angular (or React or Vue, etc), you have component code, and then you have the HTML template, and usually they interact in some way. The component code has functions in it that can be unit tested, and that part I'm ok with.
Where I haven't gotten things straight in my mind is: do you call it an integration test when you're testing how a component function changes the UI? If you're testing that kind of thing, should it be done only in E2E tests? Because Angular/Jasmine (or Jest) lets you do this kind of thing, referencing the UI:
const el = fixture.debugElement.queryAll(By.css('button')); // query the rendered template
expect(el[0].nativeElement.textContent).toEqual('Submit');
But does that mean you should? And if you do, then do you not cover that in your E2E tests?
And regarding integration with things like services, how far do you go with integrating? If you mock the actual HTTP call, and just test that it would get called with the right functions, is that an integration test, or is it still a unit test?
To sum up, I intuitively know what I need to test to have confidence that things are working as they should, I'm just not sure how to discern when something requires all three kinds of tests or not.
I know this is getting long, but here's my example app:
There's a property called hasNoProducts that is set after a product is chosen and data is returned from the server (or not if there is none). If hasNoProducts is true, UI (through an *ngIf) shows that "Sorry" message. If false, then other selections become available. Depending on the product picked, those options change.
So I know I can write a unit test and mock the HTTP request so that I can test that hasNoProducts is set correctly. But then, I want to test that the message is displayed, or that the additional options are displayed. And if there is data, test that switching the product changes the data in the other lists that would subsequently show on screen. If I do that using Angular/Jasmine, is it an integration test since I'm "integrating" component and template? If not, then what would be an integration test?
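To make that concrete, here's roughly the kind of test I'm describing (a minimal sketch; the component, service, and selector names are made up):

import { TestBed } from '@angular/core/testing';
import { By } from '@angular/platform-browser';
import { of } from 'rxjs';
import { ProductComponent, ProductService } from './product.component'; // hypothetical

it('shows the "Sorry" message when no products come back', () => {
  // ProductComponent / ProductService / '.sorry-message' are made-up names.
  const productService = jasmine.createSpyObj('ProductService', ['getProducts']);
  productService.getProducts.and.returnValue(of([])); // mock the HTTP boundary

  TestBed.configureTestingModule({
    declarations: [ProductComponent],
    providers: [{ provide: ProductService, useValue: productService }],
  });
  const fixture = TestBed.createComponent(ProductComponent);
  fixture.componentInstance.chooseProduct('widgets'); // hypothetical method
  fixture.detectChanges(); // re-render the template after the state change

  expect(fixture.componentInstance.hasNoProducts).toBe(true);
  const msg = fixture.debugElement.query(By.css('.sorry-message'));
  expect(msg.nativeElement.textContent).toContain('Sorry');
});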
I could keep asking questions, but I'll stop there in the hopes that someone has read this far and has some insight. Again, I've read tons of articles, watched tons of videos and done tutorials, but every time I sit down to apply it to a real project, I get stuck on things like this, and I want to get past this! Thanks in advance.

What distinguishes unit-tests and integration-tests (and then subsystem-tests and system-tests) is the goal that you want to achieve with the tests.
The goal of unit-testing is to find those bugs in small pieces of code that can be found if these pieces of code are isolated. Note that this does not mean you truly must isolate the code, but that your focus is the isolated code. In unit-testing, mocking is very common, since it allows you to stimulate scenarios that otherwise are hard to test, or it speeds up build and execution times, etc., but mocking is not mandatory: for example, you would not mock calls to a mathematical sin() function from the standard library, because the sin() function does not keep you from reaching your testing goals. But leaving the sin() function in does not turn these tests into integration tests. Strictly speaking, you could even have unit-tests where some real network accesses take place (if you are too lazy to mock the network access), but due to the non-determinism, delays, etc., these unit-tests would be slow and unreliable, which means they would simply not be well suited to specifically find the bugs in the isolated code. That's why everybody says "if there is some real network access, it is not a unit-test", which is not formally but practically correct.
Since in unit-testing you intentionally only focus on the isolated code, you will not find bugs that are due to misunderstandings about interactions with other components. If you mock some depended-on-component, then you implement these mocks based on your understanding of how the other component behaves. If your understanding is wrong, your mock implementations will reflect your wrong understanding, and your unit-tests will succeed, although in the integrated system things will break. That is not a flaw of unit-testing, but simply the reason why there are other test levels like integration testing. In other words, even if you do unit-testing perfectly, there will unavoidably remain some bugs that unit-testing is not even intending to find.
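Here is a tiny sketch of that failure mode (Jest-style; calculateTotal and priceService are hypothetical):

// Sketch: a mock encodes your (possibly wrong) understanding of a dependency.
// Hypothetical: suppose the real priceService.getPrice() returns cents,
// but you believe it returns dollars.
import { calculateTotal } from './cart'; // hypothetical module under test

test('total is price times quantity', () => {
  const priceService = { getPrice: jest.fn().mockReturnValue(9.99) }; // wrong unit!
  expect(calculateTotal(priceService, 'sku-1', 2)).toBe(19.98);
  // Passes against the mock, yet breaks in the integrated system,
  // where getPrice() actually returns 999 (cents).
});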
Now, what are integration tests then? They are defined by the goal of finding bugs in the interactions between (already tested) components. Such bugs can, for example, be due to mutual misconceptions of the developers of the components about how an interface is meant to work. For example, in the case of a library component B that is used from A: Does A call functions from the right component B (rather than from C)? Do the calls happen while B is already in a proper state (B might not be initialized yet, or in an error state)? Do the calls happen in the proper order? Are the arguments provided in the correct order and do they have values in the expected form (e.g. zero-based vs. one-based index? null allowed?)? Are the return values provided in the expected form (returned error code vs. exception) and do they have values in the expected form? And this is just one integration scenario - there are many others, like components exchanging data via files (binary or text? which end-of-line marker: unix, dos, ...?).
There are many possible interaction errors. To find them, in integration testing you integrate the real components (the real A and the real B, no mocks, but possibly mocks for other components) and stimulate them such that the different interactions actually take place - ideally in all interesting ways, like, trying to force some boundary cases in the interaction (exchanged file is empty, ...). Again, just the fact that the test operates on a software where some components are integrated does not make it an integration test: Only if the test is specifically designed to initiate interactions such that bugs in these interactions become apparent, then it is an integration test.
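As a sketch of what a test with that goal can look like (hypothetical components, Jest-style), the point is that the real A and the real B are wired together and the test deliberately drives their interaction:

// Integration-test sketch: real Parser (A) and real FileStore (B), no mocks
// between them. All names here are hypothetical.
import { Parser } from './parser';
import { FileStore } from './file-store';

test('Parser handles an empty file produced by FileStore', async () => {
  const store = new FileStore('/tmp/integration-test'); // real component B
  await store.write('input.csv', '');                   // force a boundary case
  const parser = new Parser(store);                     // real component A

  // The bug we hunt for lives in the A<->B interaction (encoding,
  // EOL markers, empty content), not inside either component alone.
  await expect(parser.parse('input.csv')).resolves.toEqual([]);
});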
Subsystem tests (which are the next level) then focus, again, on the remaining bugs, that is, those bugs which neither unit-testing nor integration testing intend to find. Examples are requirements on the component C that were not considered when C was decomposed into A and B, or C being built against some outdated version of A in which some bug was still present. However, when climbing up from unit-testing via integration testing to subsystem testing and above, it is a challenge to stay focused: to only have tests for bugs that could not have been found before, and not to, say, repeat unit-tests at the subsystem level.


What is Mocking?
Prologue: If you look up the noun mock in the dictionary you will find that one of the definitions of the word is something made as an imitation.
Mocking is primarily used in unit testing. An object under test may have dependencies on other (complex) objects. To isolate the behaviour of the object you want to test, you replace the other objects with mocks that simulate the behaviour of the real objects. This is useful if the real objects are impractical to incorporate into the unit test.
In short, mocking is creating objects that simulate the behaviour of real objects.
At times you may want to distinguish between mocking and stubbing. There may be some disagreement on this subject, but my definition of a stub is a "minimal" simulated object. The stub implements just enough behaviour to allow the object under test to execute the test.
A mock is like a stub but the test will also verify that the object under test calls the mock as expected. Part of the test is verifying that the mock was used correctly.
To give an example: You can stub a database by implementing a simple in-memory structure for storing records. The object under test can then read and write records to the database stub to allow it to execute the test. This could test some behaviour of the object not related to the database and the database stub would be included just to let the test run.
If you instead want to verify that the object under test writes some specific data to the database you will have to mock the database. Your test would then incorporate assertions about what was written to the database mock.
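A compact sketch of that database example (Jest-style; DatabaseStub, registerUser, and countRecords are hypothetical):

// Stub: a minimal in-memory stand-in that just lets the test run.
class DatabaseStub {
  private rows = new Map<string, string>();
  write(key: string, value: string) { this.rows.set(key, value); }
  read(key: string) { return this.rows.get(key); }
  size() { return this.rows.size; }
}

test('report counts records without touching a real database', () => {
  const db = new DatabaseStub();
  db.write('users/1', 'alice');
  expect(countRecords(db)).toBe(1); // hypothetical function under test
});

// Mock: the test also verifies HOW the object under test used the database.
test('registerUser writes the normalized email', () => {
  const db = { write: jest.fn(), read: jest.fn() };
  registerUser(db, 'Alice@Example.COM'); // hypothetical function under test
  expect(db.write).toHaveBeenCalledWith('users/alice@example.com', expect.anything());
});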
Other answers explain what mocking is. Let me walk you through it with different examples. And believe me, it's actually far simpler than you think.
tl;dr It's an instance of the original class. It has other data injected into it, so you avoid testing the injected parts and focus solely on testing the implementation details of your class/functions.
Simple example:
class Foo {
    func add(num1: Int, num2: Int) -> Int { // Line A
        return num1 + num2 // Line B
    }
}
let unit = Foo() // unit under test
XCTAssertEqual(unit.add(num1: 1, num2: 5), 6)
As you can see, I'm not testing Line A, i.e. I'm not validating the input parameters. I'm not validating whether num1 and num2 are Integers. I have no asserts against that.
I'm only testing to see if Line B (my implementation), given the mocked values 1 and 5, is doing as I expect.
Obviously in the real world this can become much more complex. The parameters can be a custom object, like a Person or Address, and the implementation details can be more than a single +. But the logic of the testing would be the same.
Non-coding Example:
Assume you're building a machine that identifies the type and brand name of electronic devices for airport security. The machine does this by processing what it sees with its camera.
Now your manager walks in the door and asks you to unit-test it.
Then you, as a developer, can either bring 1000 real objects, like a MacBook Pro, a Google Nexus, a banana, an iPad, etc., put them in front of it, and see if it all works.
But you can also use mocked objects, like an identical-looking MacBook Pro (with no real internal parts) or a plastic banana. You can save yourself from investing in 1000 real laptops and rotting bananas.
The point is you're not trying to test whether the banana is fake or not, nor whether the laptop is fake or not. All you're doing is testing whether your machine, once it sees a banana, says "not an electronic device", and for a MacBook Pro says "Laptop, Apple". To the machine, the outcome of its detection should be the same for fake/mocked electronics and real electronics. If your machine also factored in the internals of a laptop (x-ray scan) or banana, then your mocks' internals would need to look the same as well. But you could also use a MacBook that no longer works.
Had your machine tested whether or not devices can power on, then you'd need real devices.
The logic mentioned above applies to unit-testing of actual code as well: a function should work the same with real values you get from real input (and interactions) as with mocked values you inject during unit-testing. And just as you save yourself from using a real banana or MacBook, with unit-tests (and mocking) you save yourself from having to force your server to actually return a status code of 500, 403, 200, etc. (your server only returns 500 when it is down and 200 when it is up, and it gets difficult to run 100 network-focused tests if you have to constantly wait 10 seconds while switching the server between up and down). So instead you inject/mock a response with status code 500, 200, 403, etc., and test your unit/function with the injected/mocked value.
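A minimal sketch of that injection with Jest (fetchUserProfile and its error message are hypothetical; assumes Node 18+, where fetch and Response exist globally):

import { fetchUserProfile } from './api'; // hypothetical module under test

test('a 500 response surfaces as the expected error', async () => {
  // Inject the response instead of actually taking the server down.
  jest.spyOn(global, 'fetch').mockResolvedValue(new Response('{}', { status: 500 }));
  await expect(fetchUserProfile('user-1')).rejects.toThrow('ServerError');
});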
Be aware:
Sometimes you don't correctly mock the actual object, or you don't mock every possibility. E.g. your fake laptops are dark, and your machine works accurately with them, but then it doesn't work accurately with white fake laptops. Later, when you ship this machine to customers, they complain that it doesn't work all the time, and you get random reports that it's not working. It takes you 3 months to figure out that the color of the fake laptops needs to be more varied so you can test your modules appropriately.
For a true coding example, your implementation may be different for status code 200 with image data returned vs. 200 with no image data returned. For this reason it's good to use an IDE that provides code coverage, which highlights any lines your unit-tests never execute.
Real world coding Example:
Let's say you are writing an iOS application and have network calls. Your job is to test your application. To test/identify whether or not the network calls work as expected is NOT YOUR RESPONSIBILITY. It's another party's (the server team's) responsibility to test it. You must remove this (network) dependency and yet continue to test all your code that works around it.
A network call can return various status codes (404, 500, 200, 303, etc.) with a JSON response.
Your app is supposed to work for all of them (in the case of errors, your app should throw its expected error). What you do with mocking is create 'imaginary, similar-to-real' network responses (like a 200 code with a JSON file) and test your code without making the real network call and waiting for the network response. You manually hardcode/return the network response for ALL kinds of network responses and see if your app is working as you expect. (You never assume/test a 200 with incorrect data, because that is not your responsibility; your responsibility is to test your app with a correct 200, or, in the case of a 400 or 500, to test whether your app throws the right error.)
This creating of 'imaginary, similar-to-real' responses is known as mocking.
In order to do this, you can't use your original code (your original code doesn't have the pre-inserted responses, right?). You must add something to it: inject/insert that dummy data, which isn't normally needed (or a part of your class).
So you create an instance of the original class, add whatever you need to it (here the network HTTPResponse and data, or, in the case of failure, the correct errorString and HTTPResponse), and then test the mocked class.
Long story short, mocking is to simplify and limit what you are testing, and it also lets you feed a class what it depends on. In this example you avoid testing the network calls themselves, and instead test whether or not your app works as you expect with the injected outputs/responses, by mocking classes.
Needless to say, you test each network response separately.
Now a question that I always had in my mind was: The contracts/end points and basically the JSON response of my APIs get updated constantly. How can I write unit tests which take this into consideration?
To elaborate more on this: let's say the model requires a key/field named username. You test this and your test passes.
Two weeks later the backend changes the key's name to id. Your tests still pass, right? Or not?
Is it the backend developer's responsibility to update the mocks? Should it be part of our agreement that they provide updated mocks?
The answer to the above issue is that unit tests + your development process as a client-side developer should/would catch outdated mocked responses. If you ask me how, the answer is:
Our actual app would fail (or not fail, yet not have the desired behavior) without using the updated APIs; hence if that fails, we will make changes to our development code, which again leads to our tests failing, which we'll have to correct. (Actually, if we are to do the TDD process correctly, we are not to write any code for the field unless we write the test for it first, see it fail, and then go and write the actual development code for it.)
This all means that the backend doesn't have to say "hey, we updated the mocks"; it eventually happens through your code development/debugging, because it's all part of the development process! Though if the backend provides the mocked response for you, then it's easier.
My whole point on this is that (if you can't automate getting updated mocked API responses, then) human interaction is likely required, i.e. manual updates of JSONs and short meetings to make sure their values are up to date will become part of your process.
This section was written thanks to a Slack discussion in our CocoaHead meetup group.
Confusion:
It took me a while to stop confusing the 'unit test for a class' with the 'stubs/mocks of a class'.
E.g. in our codebase we have:
class Device
class DeviceTests
class MockDevice
class DeviceManager
class Device is the actual class itself.
class DeviceTests is where we write unit-tests for the Device class
class MockDevice is a mock class of Device. We use it only for the purpose of testing, e.g. if our DeviceManager needs to be unit-tested, then we need dummy/mock instances of the Device class; MockDevice fulfills that need.
tl;dr: you use mock classes/objects to test other objects. You don't use mock objects to test themselves.
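A sketch of how those pieces relate (TypeScript; the interfaces and methods are made up):

// Sketch: MockDevice exists to test DeviceManager, not to test itself.
interface Device {
  isOnline(): boolean;
}

class MockDevice implements Device {
  constructor(private online: boolean) {}
  isOnline() { return this.online; } // canned behaviour, no real hardware
}

class DeviceManager {
  constructor(private devices: Device[]) {}
  onlineCount() { return this.devices.filter(d => d.isOnline()).length; }
}

test('DeviceManager counts online devices', () => {
  const manager = new DeviceManager([new MockDevice(true), new MockDevice(false)]);
  expect(manager.onlineCount()).toBe(1);
});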
For iOS devs only:
A very good example of mocking is this Practical Protocol-Oriented talk by Natasha Muraschev. Just skip to minute 18:30, though the slides may become out of sync with the actual video 🤷‍♂️
I really like this part from the transcript:
Because this is testing...we do want to make sure that the get function from the Gettable is called, because it can return, and the function could theoretically assign an array of food items from anywhere. We need to make sure that it is called.
There are plenty of answers on SO and good posts on the web about mocking. One place that you might want to start looking is the post by Martin Fowler Mocks Aren't Stubs where he discusses a lot of the ideas of mocking.
In one paragraph - mocking is one particular technique to allow testing of a unit of code without being reliant upon dependencies. In general, what differentiates mocking from other methods is that mock objects used to replace code dependencies allow expectations to be set: a mock object will know how it is meant to be called by your code and how to respond.
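Since this thread is JavaScript-focused, that "expectation-setting" looks roughly like this in Jest terms (notifyUser and sendEmail are hypothetical):

// A mock knows how it is meant to be called, and the test verifies it.
const sendEmail = jest.fn();              // mock dependency
notifyUser(sendEmail, 'alice');           // hypothetical code under test
expect(sendEmail).toHaveBeenCalledTimes(1);
expect(sendEmail).toHaveBeenCalledWith('alice', expect.any(String));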
Your original question mentioned TypeMock, so I've left my answer to that below:
TypeMock is the name of a commercial mocking framework.
It offers all the features of the free mocking frameworks like RhinoMocks and Moq, plus some more powerful options.
Whether or not you need TypeMock is highly debatable - you can do most mocking you would ever want with free mocking libraries, and many argue that the abilities offered by TypeMock will often lead you away from well encapsulated design.
As another answer stated, 'TypeMocking' is not actually a defined concept, but could be taken to mean the type of mocking that TypeMock offers: using the CLR profiler to intercept .Net calls at runtime, giving a much greater ability to fake objects (without requirements such as needing interfaces or virtual methods).
A mock is a method/object that simulates the behavior of a real method/object in controlled ways. Mock objects are used in unit testing.
Often a method under test calls other external services or methods within it. These are called dependencies. Once mocked, the dependencies behave the way we defined them.
With the dependencies being controlled by mocks, we can easily test the behavior of the method that we coded. This is Unit testing.
Mocking is generating pseudo-objects that simulate real objects' behaviour for tests.
The purpose of mocking types is to sever dependencies in order to isolate the test to a specific unit. Stubs are simple surrogates, while mocks are surrogates that can verify usage. A mocking framework is a tool that will help you generate stubs and mocks.
EDIT: Since the original wording mentioned "type mocking", I got the impression that this related to TypeMock. In my experience the general term is just "mocking". Please feel free to disregard the info below specifically on TypeMock.
TypeMock Isolator differs from most other mocking frameworks in that it works by modifying IL on the fly. That allows it to mock types and instances that most other frameworks cannot. To mock these types/instances with other frameworks, you must provide your own abstractions and mock those.
TypeMock offers great flexibility at the expense of a clean runtime environment. As a side effect of the way TypeMock achieves its results you will sometimes get very strange results when using TypeMock.
I would think the use of the TypeMock isolator mocking framework would be TypeMocking.
It is a tool that generates mocks for use in unit tests, without the need to write your code with IoC in mind.
If your mock involves a network request, another alternative is to have a real test server to hit. You can use a service to generate a request and response for your testing.

What is the way to write tests against a 3rd party software like redis

I'm not sure if this is the right forum to ask this question if not kindly point me in the right direction.
I wanted to create a library/client for a 3rd party tool, which is similar to redis. For the unit/integration tests, I see that the predis library has tests which directly interact with a running redis instance, as well as tests which make use of mocks.
So my question is that, is it okay if I write tests running against an actual instance of the 3rd party tool or should I employ mocks all the way?
When writing unit tests it is important to only test the functionality you are interested in. When you have a third party library, you are interested in one of two things when running a test:
Does the third party software behave correctly
You can write a test for a third party library and treat the tool as a black box, so you aren't testing the internals but whether it behaves consistently. Pseudo code of such a test:
//testing if a value is automatically timestamped
expected = "expected value"
tool.setValue("myKey", expected)
actual = tool.getValue("myKey")
assertThat(actual, endsWith(expected))
assertThat(actual, startsWith(dateToday()))
This test formalises your assumptions and expectations about the behaviour, and can be useful if you change the third party tool and want to see if it still behaves as you expect. You don't care about the internals, just how you use it. This can be useful to verify when upgrading to a newer version of the tool itself, or if you switch to an alternative and want to ensure it works the same way. Important to note is that it works the same way as far as your expectations go - whatever you change to could be faster, or maybe communicate over the network, or have some other effect that you don't care about.
Does your code behave correctly
In this case, you'd be writing a test that only tests your own code. Unit tests isolate the functionality, so you can replace that tool with a mock in order to verify only that your own code is correct. For example, if you switch to a new version of the tool that doesn't do timestamps, you don't want your test to fail for external reasons.
Here is a sample pseudo code of what a test will look like:
//check your code inserts the correct values without modifying them
mockTool = mock(SomeThirdPartyTool)
testInstance = new MyClass(mockTool)
expected = "some value"
expect(mockTool.insertValue()).toBeCalledWith(expected)
testInstance.insertValue(expected)
assertThat(expectationSatisfied())
In this case, changes to the third party tool would not influence the test. If you change the configuration to add or remove a timestamp to the value, the test will still be correct. It would fail if you manually add a timestamp in your code. This is exactly what you want - your test only fails for one clear reason.

When should one use jest snapshot testing vs just regular assertions in unit test?

This question is just about unit tests.
Recently I've been reading a lot about snapshots and I'm really confused as to when exactly should I use snapshot testing vs just explicit assertions. I use react & jest & enzyme for unit testing
As far as I understand it definitely makes sense to use snapshot testing:
to check if the component rendered the way we expected it to with the expected props. That way we don't really have to write an assertion for each prop or each component that was rendered, etc.
Questions:
1) But when it comes to user interactions like blur or click, there could be many cases. Does it make sense to have a snap for each of those test cases? Say I have 10 different cases that I want to test for onBlur. Does it make sense to have 10 different snaps for that? I know we can use serializers to filter out what we want to see in the snap, but isn't a regular data-driven test (which contains input & expected output provided by the developer) with a single assertion just better?
2) How about when I have a component which in turn renders a couple of child components, and those child components render their children, etc.? In that case I mount and then take the snapshot, and that snap becomes really huge. Again, I know we can tweak it by use of serializers, but really, what's so great about snapshots in this case?
3) Isn't having too many snapshots in general a bad thing ?
I also came across some fancy tools like jest-glamor-react etc. which can be used to get the most out of snapshot testing. But really, how do I figure out which scenario is best tested using snapshots and which is best tested using regular assertions? I've read a bunch of articles; some people are really impressed with snapshots, but the examples are really basic, while others are totally against them and think plain old assertions are way better. Can someone please share their views?
Why snapshot testing?
No flakiness: Because tests are run in a command line runner instead of a real browser or on a real phone, the test runner doesn't have to wait for builds, spawn browsers, load a page and drive the UI to get a component into the expected state which tends to be flaky and the test results become noisy.
Fast iteration speed: Engineers want to get results in less than a second rather than waiting for minutes or even hours. If tests don't run quickly like in most end-to-end frameworks, engineers don't run them at all or don't bother writing them in the first place.
Debugging: It's easy to step into the code of an integration test in JS instead of trying to recreate the screenshot test scenario and debugging what happened in the visual diff.
These points are taken from the Jest documentation on snapshot testing.
Now to answer your question:
I would use snapshot testing when I have to keep track of UI elements, ensuring nothing is changed without having intentionally made that change. Snapshot helps you achieve this.
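To make the trade-off concrete, here's a minimal sketch (Jest + react-test-renderer; the Badge component is hypothetical):

import React from 'react';
import renderer from 'react-test-renderer';
import { Badge } from './Badge'; // hypothetical component

// Snapshot: guards the entire rendered output against unintended change.
test('Badge renders consistently', () => {
  const tree = renderer.create(<Badge label="New" />).toJSON();
  expect(tree).toMatchSnapshot();
});

// Explicit assertion: pins down the one behaviour this test cares about.
test('Badge shows its label', () => {
  const tree = renderer.create(<Badge label="New" />).toJSON();
  expect(JSON.stringify(tree)).toContain('New');
});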

Gherkin - maintaining state between scenarios

Although I have been writing unit tests for 20-odd years, I am new to Gherkin, and have been given the task of implementing a story for a .feature file that reduces to something like this:
Scenario: a
  Given that the app is open
  When I open a certain dialog
  Then it has a thing somewhere

Scenario: b
  Given that the dialog from 'a' is open...

# Imagine here a long chain of scenarios, each depending on the previous

Scenario: n
  Given that the previous 'n' steps have all completed...
That is, a long, long chain of scenarios, each depending on the state of the system as configured by its predecessor.
This doesn't feel right to someone used to unit testing -- but these scenarios are not going to be split and run separately.
What's the best practice here?
Should I rewrite into one very long scenario?
I am already using a 'page object' to keep most of my code out of the step definitions -- should I be coding the steps as single calls, that can be re-used in the later scenarios?
I'm running Cucumber in Javascript.
First things first, Warning:
For the majority of tests (and by majority I mean 99.9% of the time), you should not carry state over from the previous scenario, because if one scenario fails in your feature, more will crumble with it, since you attempted to string them together.
And on to my answer:
Depending on whether you are trying to do a set-up for all of the scenarios that follow (within the same feature), or whether you want to reuse that first scenario multiple times (in separate features), you could do one of two things:
Make the first scenario a background
Make the first scenario into a step definition, for use in multiple feature files
For the First:
Background:
  Given that the app is open
  When I open a certain dialog
  Then it has a thing somewhere

Scenario: a
  Given that the dialog from 'a' is open...
Just remember that when you use it as a background, it will be used for all the following scenarios within that feature.
For the Second:
Scenario: a
  Given that the app is open
  When I open a certain dialog
  Then it has a thing somewhere

Scenario: b
  Given I have opened the dialogue from a
  And the '<DialogFromA>' dialog is open...
I would ask myself, what is the actual behaviour behind all the steps?
Then I would implement that as the wanted behaviour and push the ordering between steps down the stack, probably using one or many helper classes. There is nothing that says you can rely on the order of the scenarios without introducing some hack to tie them together.
Remember that BDD and Cucumber are all about human-readable communication. The dependencies you are asking for should, in my opinion, be implemented in the support code Gherkin triggers.
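For example, a sketch of pushing that shared state into support code with cucumber-js (DialogPage and the step wording are hypothetical):

// Sketch: each scenario sets up its own state through a helper,
// instead of depending on a previous scenario having run.
import { Given, Then } from '@cucumber/cucumber';
import { DialogPage } from './pages/dialog-page'; // hypothetical page object

Given('the settings dialog is open', async function () {
  this.dialog = new DialogPage(this.driver);
  await this.dialog.open(); // handles app-open + navigation itself
});

Then('it has a thing somewhere', async function () {
  const visible = await this.dialog.hasThing();
  if (!visible) throw new Error('expected the thing to be visible');
});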

Javascript Unit Testing - DOM Manipulation

I'm quite new to JavaScript unit testing. One thing keeps bothering me: when testing JavaScript, we often need to do DOM manipulation. It looks like I am unit testing a method/function in a Controller/Component, but I still need to depend on the HTML elements in my templates. Once the id (or the attributes used as selectors in my test cases) is changed, my test cases also need to be CHANGED! Wouldn't this violate the purpose of unit testing?
One of the toughest parts of JavaScript unit testing is not the testing itself, it's learning how to architect your code so that it is testable.
You need to structure your code with a clear separation of testable logic and DOM manipulation.
My rule of thumb is this:
If you are testing anything that is dependent on the DOM structure, then you are doing it wrong.
In summary: try to test data manipulations and logical operations only.
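A sketch of that separation (hypothetical names): keep the logic pure and the DOM layer thin.

// Pure, testable logic: no DOM in sight.
export function formatPrice(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

// Thin DOM layer: just wiring, not worth unit-testing on its own.
export function renderPrice(el: HTMLElement, cents: number): void {
  el.textContent = formatPrice(cents);
}

// The unit test exercises only the logic:
test('formatPrice renders dollars and cents', () => {
  expect(formatPrice(1999)).toBe('$19.99');
});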
I respectfully disagree with @BentOnCoding. Most often a component is more than just its class: a component combines an HTML template and a JavaScript/TypeScript class. That's why you should test that the template and the class work together as intended. Class-only tests can tell you about class behavior, but they cannot tell you if the component is going to render properly and respond to user input.
Some people say you should test it in integration tests. But integration tests are slower to write/run and more expensive (in terms of time and resources) to run/maintain. So, testing most of your component functionality in integration tests might slow you down.
It doesn't mean you should skip integration tests. While integration and E2E tests may be slower and more expensive than unit tests, they bring you more confidence that your app is working as intended. An integration test is where individual units/components are combined and tested as a group; it shouldn't be considered the only place to test your component's template.
I think I'd second @BentOnCoding's recommendation that what you want to unit test is your code, not anything else. When it comes to DOM manipulation, that's browser code, such as appendChild, replaceChild, etc. If you're using jQuery or some other library, the same still applies: you're calling some other code to do the manipulation, and you don't need to test that.
So how do you assert that calling some function on your viewmodel/controller resulted in the DOM structure that you wanted? You don't. Just as you wouldn't unit test that calling a stored procedure on a DB resulted in a specific row in a specific table. You need to instead think about how to abstract out the parts of your controller that deal with inputs/outputs from the parts that manipulate the DOM.
For instance, if you had a method that called alert() based on some conditions, you'd want to separate the method into two:
One that takes and processes the inputs
One that calls window.alert()
During the test, you'd substitute window.alert (or your proxy method to it) with a fake (see SinonJS), and call your input processor with the conditions to cause (or not cause) the alert. You can then assert different values on whether the fake was called, how many times, with what values, etc. You don't actually test window.alert() because it's external to your code. It's assumed that those external dependencies work correctly. If they don't, then that's a bug for that library, but it's not your unit test's job to uncover those bugs. You're only interested in verifying your own code.
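A sketch of that with Sinon (showWarningIf is a hypothetical input processor):

import sinon from 'sinon';
import { showWarningIf } from './warnings'; // hypothetical module under test

it('alerts only when the threshold is exceeded', () => {
  const fakeAlert = sinon.fake();
  sinon.replace(window, 'alert', fakeAlert); // substitute the browser API

  showWarningIf(11, 10); // over the threshold: should alert
  showWarningIf(5, 10);  // under it: should not

  sinon.assert.calledOnce(fakeAlert); // verify the fake, not window.alert itself
  sinon.restore();
});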
