So I have a new assignment at university involving lots of people collaborating. We want to use continuous integration (we're thinking of CircleCI) and we want to use a TDD approach.
My biggest question is: how do you correctly use TDD? I might have the wrong idea, but from what I understand you write all your tests first and make them fail, because you don't have any code yet. But how can I write all my tests if I don't even know all the units I will have or need?
In this case, since we're using CircleCI and assuming it won't let me merge code that doesn't pass the tests, how can this work? There will be tests written but no code for those tests yet.
Am I wrong, and you actually write the tests as you go along with the development of the features?
This is a subject that I am really having a hard time grasping, but I would really love to understand it properly, as I believe it will really help in the future.
My biggest question is: how do you correctly use TDD? I might have the wrong idea, but from what I understand you write all your tests first and make them fail, because you don't have any code yet. But how can I write all my tests if I don't even know all the units I will have or need?
Not quite the right idea.
You might start by thinking about the problem, and creating a checklist of tests that you expect to implement before you are done.
But the actual implementation cycle is incremental. We work on one test at a time, starting from the first. We make that test pass, and clean up all of the code, before we introduce a second test.
The idea here is that we'll be learning as we go -- we may think of more tests, which get added to the checklist, or we may decide that tests we thought would be important aren't after all, so they get crossed off the checklist.
At any given point in time, we expect that either (a) all of the implemented tests are passing, or (b) exactly one implemented test is failing, and it is the one we are currently working on. Any time we discover some other condition holds, then we back up, reverting to some previously well understood state, and then proceed forwards again.
We don't normally push/publish/share code when it has broken tests. Instead, the test and a working implementation are shared together. We don't share the broken intermediate stages, or known mistakes; instead, we share progress.
A review of the slides in the Bowling Game Kata may help to clarify what the rhythm of the work looks like.
It is completely normal to feel like the first test is hard -- you are writing a test against code that doesn't exist yet. We tend to employ imagination here: suppose the production code you need already exists. How would you invoke it? What data would you pass to it? What data would you get back? You write the test as though the perfect interface for what you want to do already exists. Then you create production code that matches that interface; then you give that production code the correct behavior; then you give the production code a design that will make it easy to change later.
And when you are happy with all of that, you introduce the second test, which usually looks like the first test with slightly different data, and a different expected result. So the second test fails, and then you go to the easy-to-change code you wrote before, and adapt it so that the second test also passes. And then you again clean up the design so that the code is easily changed.
And so it goes, until you reach the end of your checklist.
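For illustration, here is a rough sketch, in a Jest-style TypeScript test, of how the first two steps of that cycle might look, using the bowling example from the kata mentioned above; the names and numbers are just for the example:

// Test 1: imagine the interface you wish existed, and write the test against it.
test('a gutter game scores 0', () => {
  expect(score(Array(20).fill(0))).toBe(0);
});

// The simplest production code that makes test 1 pass.
function score(rolls: number[]): number {
  return 0;
}

// Test 2: introduced only after test 1 passes and the code has been cleaned up.
// Same shape as test 1, with different data and a different expected result.
test('a game of all ones scores 20', () => {
  expect(score(Array(20).fill(1))).toBe(20);
});

// Making test 2 pass then forces the implementation to grow, e.g.:
// function score(rolls: number[]): number { return rolls.reduce((sum, pins) => sum + pins, 0); }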
I'm not sure if this is the right forum to ask this question; if not, kindly point me in the right direction.
I wanted to create a library/client for a 3rd party tool, which is similar to redis. For the unit/integration tests, I see that in the predis library they have tests which directly interact with a running redis instance, and there are tests which make use of mocks.
So my question is: is it okay to write tests that run against an actual instance of the 3rd party tool, or should I employ mocks all the way?
When writing unit tests it is important to only test the functionality you are interested in. When you have a third party library, you are interested in one of two things when running a test:
Does the third party software behave correctly?
You can write a test for a third party library and treat the tool as a black box, so you aren't testing the internals but whether it behaves consistently. Pseudo code of such a test:
//testing if a value is automatically timestamped
expected = "expected value"
tool.setValue("myKey", expected)
actual = tool.getValue("myKey")
assertThat(actual, endsWith(expected))
assertThat(actual, startsWith(dateToday()))
This test formalises your assumptions and expectations about the behaviour, and can be useful if you change the third party tool and want to see whether it still behaves as you expect. You don't care about the internals, just how you use it. That makes it useful when upgrading to a newer version of the tool, or when switching to an alternative, to ensure it works the same way. Important to note: "works the same way" only as far as your expectations go - whatever you change to could be faster, or communicate over the network, or have some other effect that you don't care about.
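In a JavaScript/TypeScript project with Jest, such a black-box test could look roughly like this; the client module and its setValue/getValue API are assumptions for the sketch, not the real tool's API:

// Black-box test: talks to a real, running instance of the third party tool.
import { createClient } from './third-party-tool-client'; // hypothetical client wrapper

test('stored values are automatically timestamped', async () => {
  const tool = createClient({ host: 'localhost' });
  const expected = 'expected value';

  await tool.setValue('myKey', expected);
  const actual = await tool.getValue('myKey');

  // Same expectations as the pseudo code above: value preserved, date prepended.
  expect(actual.endsWith(expected)).toBe(true);
  expect(actual.startsWith(new Date().toISOString().slice(0, 10))).toBe(true);
});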
Does your code behave correctly?
In this case, you'd be writing a test that only tests your own code. Unit tests isolate functionality, so you can replace the tool with a mock in order to verify only that your own code is correct. For example, if you switch to a new version of the tool that doesn't do timestamps, you don't want your test to fail for purely external reasons.
Here is some sample pseudo code of what such a test could look like:
//check your code inserts the correct values without modifying them
mockTool = mock(SomeThirdPartyTool)
testInstance = new MyClass(mockTool)
expected = "some value"
expect(mockTool.insertValue).toBeCalledWith(expected)
testInstance.insertValue(expected)
assertThat(expectationSatisfied())
In this case, changes to the third party tool would not influence the test. If you change the tool's configuration to add or remove a timestamp on the value, the test will still be correct; it would only fail if you manually added a timestamp in your own code. This is exactly what you want - your test fails for one clear reason.
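A runnable Jest version of that pseudo code might look roughly like this; MyClass and the tool's insertValue method are assumptions carried over from the sketch above:

// Unit test: the third party tool is replaced by a mock, so only MyClass is exercised.
import { MyClass } from './my-class'; // hypothetical class under test

test('insertValue passes the value through unmodified', () => {
  const mockTool = { insertValue: jest.fn() }; // minimal hand-rolled mock of the tool
  const testInstance = new MyClass(mockTool);

  testInstance.insertValue('some value');

  expect(mockTool.insertValue).toHaveBeenCalledWith('some value');
});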
Let's say you have a TypeScript object where any element can be undefined. If you want to access a heavily nested property, you have to do a lot of comparisons against undefined.
I wanted to compare two ways of doing this in terms of performance: regular if-else comparisons and the lodash function get.
I have found this beautiful tool called jsben where you can benchmark different pieces of JS code. However, I fail to interpret the results correctly.
In this test, lodash get seems to be slightly faster. However, if I define my variable in the Setup block (as opposed to the Boilerplate code), the if-else code is faster by a wide margin.
What is the proper way of benchmarking all this?
How should I interpret the results?
Is get so much slower that you can make an argument in favour of if-else clauses, in spite of the very poor readability?
I think you're asking the wrong question.
First of all, if you're going to do performance micro-optimization (as opposed to, say, algorithmic optimization), you should really know whether the code in question is a bottleneck in your system. Fix the worst bottlenecks until your performance is fine, then stop worrying overmuch about it. I'd be quite surprised if variation between these ever amounted to more than a rounding error in a serious application. But I've been surprised before; hence the need to test.
Then, when it comes to the actual optimization, the two implementations are only slightly different in speed, in either configuration. But if you want to test the deep access to your object, it looks as though the second one is the correct way to think about it. It doesn't seem as though it should make much difference in relative speeds, but the first one puts the initialization code where it will be "executed before every block and is part of the benchmark." The second one puts it where "it will be run before every test, and is not part of the benchmark." Since you want to compare data access and not data initialization, this seems more appropriate.
Given this, there seems to be a very slight performance advantage to the families && families.Trump && families.Trump.members && ... technique. (Note: no ifs or elses in sight here!)
But is it worth it? I would say not. The code is much, much uglier. I would not add a library such as lodash (or my favorite, Ramda) just to use a function as simple as this, but if I was already using lodash I wouldn't hesitate to use the simpler code here. And I might import one from lodash or Ramda, or simply write my own otherwise, as it's fairly simple code.
That native code is going to be faster than more generic library code shouldn't be a surprise. It doesn't always happen, as sometimes libraries get to take shortcuts that the native engine cannot, but it's likely the norm. The reason to use these libraries rarely has to do with performance, but with writing more expressive code. Here the lodash version wins, hands-down.
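Concretely, the two styles being compared look like this (the families/Trump shape is taken from the snippet above; the data itself is made up):

import _ from 'lodash';

// Example data; in the benchmark this would live in the setup block.
const families: any = { Trump: { members: ['memberA', 'memberB'] } };

// Plain && chain: evaluates left to right and stops at the first undefined link.
const viaChain = families && families.Trump && families.Trump.members;

// lodash get: the same lookup as a path string, with an optional default value.
const viaGet = _.get(families, 'Trump.members', []);

The third argument to _.get also covers the "missing key" case that the && chain handles by evaluating to undefined.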
What is the proper way of benchmarking all this?
Only benchmark the actual code you are comparing; move as much as possible outside of the tested block. Run each of the two pieces a few (hundred) thousand times to average out the influence of other factors.
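A minimal hand-rolled version of that approach might look like the sketch below; the iteration count is arbitrary, and a dedicated benchmarking tool will handle warm-up and statistics more rigorously:

import _ from 'lodash';

// Setup: create the data once, outside the timed block.
const families: any = { Trump: { members: ['memberA'] } };
let sink: unknown; // assign results to a sink so the work cannot be optimized away entirely

function benchmark(label: string, fn: () => unknown, iterations = 1_000_000): void {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    sink = fn(); // only the code under test runs inside the timed loop
  }
  console.log(`${label}: ${(performance.now() - start).toFixed(1)} ms for ${iterations} runs`);
}

benchmark('&& chain', () => families && families.Trump && families.Trump.members);
benchmark('_.get', () => _.get(families, 'Trump.members'));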
How should I interpret the results?
1) check if they are valid:
Do the results fit your expectation?
If not, could there be a cause for that?
Does the testcase replicate your actual usecase?
2) check if the result is relevant:
How does the time it takes compare to the actual time in your use case? If your code takes 200ms to load, and both tests run in under ~1ms, your result doesn't matter. If, however, you are trying to optimize code that runs 60 times per second, 1ms is already a lot.
3) check if the result is worth the work
Often you have to do a lot of refactoring, or you have to type a lot; does the performance gain outweigh the time you invest?
Is get so much slower that you can make an argument in favour of if-else clauses, in spite of the very poor readability?
I'd say no. Use _.get (unless you are planning to run it a few hundred times per second).
I recently watched a number of talks from the AssertJS conference (which I highly recommend), among them #kentcdodds "Write Tests, Not Too Many, Mostly Integration." I've been working on an Angular project for over a year, have written some unit tests, and just started playing with Cypress, but I still feel this frustration around integration tests, and where to draw the lines. I'd really love to talk to some pro who does this day in and day out, but I don't know any where I work. Since I'm tired of not being able to figure this out, I thought I'd just ask the world here, cause you all are fantastic.
So in Angular (or React or Vue, etc), you have component code, and then you have the HTML template, and usually they interact in some way. The component code has functions in it that can be unit tested, and that part I'm ok with.
Where I haven't gotten things straight in my mind is, do you call it an integration test when you're testing how a component function changes the UI? If you're testing that kind of thing, should that be done just in E2E tests? Because Angular/Jasmine(or Jest) lets you do this kind of thing, referencing the UI:
const el = fixture.debugElement.queryAll(By.css('button'));
expect(el[0].nativeElement.textContent).toEqual('Submit');
But does that mean you should? And if you do, then do you not cover that in your E2E tests?
And regarding integration with things like services, how far do you go with integrating? If you mock the actual HTTP call, and just test that it would get called with the right functions, is that an integration test, or is it still a unit test?
To sum up, I intuitively know what I need to test to have confidence that things are working as they should; I'm just not sure how to discern when something requires all three kinds of tests or not.
I know this is getting long, but here's my example app:
There's a property called hasNoProducts that is set after a product is chosen and data is returned from the server (or not if there is none). If hasNoProducts is true, UI (through an *ngIf) shows that "Sorry" message. If false, then other selections become available. Depending on the product picked, those options change.
So I know I can write a unit test and mock the HTTP request so that I can test that hasNoProducts is set correctly. But then, I want to test that the message is displayed, or that the additional options are displayed. And if there is data, test that switching the product changes the data in the other lists that would subsequently show on screen. If I do that using Angular/Jasmine, is it an integration test since I'm "integrating" component and template? If not, then what would be an integration test?
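For concreteness, the kind of component-plus-template test being described might look roughly like this; the component name and CSS selector are invented for the sketch:

import { TestBed } from '@angular/core/testing';
import { By } from '@angular/platform-browser';
import { ProductPickerComponent } from './product-picker.component'; // hypothetical component

it('shows the "Sorry" message when hasNoProducts is true', () => {
  TestBed.configureTestingModule({ declarations: [ProductPickerComponent] });
  const fixture = TestBed.createComponent(ProductPickerComponent);

  fixture.componentInstance.hasNoProducts = true; // state normally set after the (mocked) HTTP call
  fixture.detectChanges();                        // let the *ngIf re-evaluate

  const message = fixture.debugElement.query(By.css('.no-products-message'));
  expect(message).not.toBeNull();
});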
I could keep asking questions, but I'll stop there in the hopes that someone has read this far and has some insight. Again, I've read tons of articles, watched tons of videos and done tutorials, but every time I sit down to apply to a real project, I get stuck on things like this, and I want to get past this! Thanks in advance.
What distinguishes unit-tests and integration-tests (and then subsystem-tests and system-tests) is the goal that you want to achieve with the tests.
The goal of unit-testing is to find those bugs in small pieces of code that can be found if these pieces of code are isolated. Note that this does not mean you truly must isolate the code, but it means your focus is the isolated code. In unit-testing, mocking is very common, since it allows you to simulate scenarios that are otherwise hard to test, speeds up build and execution times, and so on - but mocking is not mandatory. For example, you would not mock calls to a mathematical sin() function from the standard library, because the sin() function does not keep you from reaching your testing goals. But leaving the sin() function in does not turn these tests into integration tests. Strictly speaking, you could even have unit-tests where some real network accesses take place (if you are too lazy to mock the network access), but due to the non-determinism, delays etc., these unit tests would be slow and unreliable, which means they would simply not be well suited to specifically finding the bugs in the isolated code. That's why everybody says that "if there is some real network access, it is not a unit-test", which is not formally but practically correct.
Since in unit-testing you intentionally only focus on the isolated code, you will not find bugs that are due to misunderstandings about interactions with other components. If you mock some depended-on-component, then you implement these mocks based on your understanding of how the other component behaves. If your understanding is wrong, your mock implementations will reflect your wrong understanding, and your unit-tests will succeed, although in the integrated system things will break. That is not a flaw of unit-testing, but simply the reason why there are other test levels like integration testing. In other words, even if you do unit-testing perfectly, there will unavoidably remain some bugs that unit-testing is not even intending to find.
Now, what are integration tests then? They are defined by the goal to find bugs in the interactions between (already tested) components. Such bugs can, for example, be due to mutual misconceptions of the developers of the components about how an interface is meant to work. For example, in the case of a library component B that is used from A: Does A call functions from the right component B (rather than from C)? Do the calls happen while B is already in a proper state (B might not be initialized yet, or in an error state)? Do the calls happen in the proper order? Are the arguments provided in the correct order and with values in the expected form (e.g. zero-based vs. one-based index? null allowed?)? Are the return values provided in the expected form (returned error code vs. exception) and with values in the expected form? This is for one integration scenario - there are many others, like components exchanging data via files (binary or text? which end-of-line marker: unix, dos, ...?, ...).
There are many possible interaction errors. To find them, in integration testing you integrate the real components (the real A and the real B, no mocks, but possibly mocks for other components) and stimulate them such that the different interactions actually take place - ideally in all interesting ways, like, trying to force some boundary cases in the interaction (exchanged file is empty, ...). Again, just the fact that the test operates on a software where some components are integrated does not make it an integration test: Only if the test is specifically designed to initiate interactions such that bugs in these interactions become apparent, then it is an integration test.
Subsystem tests (which are the next level) then focus, again, on the remaining bugs, that is, those bugs which neither unit-testing nor integration testing intend to find. Examples are requirements on component C that were not considered when C was decomposed into A and B, or C being built against an outdated version of A in which some bug was still present. However, when climbing up from unit-testing via integration testing to subsystem testing and above, it is a challenge to stay focused: only to have tests for bugs that could not have been found before, and not to, say, repeat unit-tests at the subsystem level.
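As a small illustration of the difference, using placeholder components A and B:

// A depends on B through this interface.
interface PriceProvider { priceOf(item: string): number; }

class A {
  constructor(private prices: PriceProvider) {}
  totalFor(items: string[]): number {
    return items.reduce((sum, item) => sum + this.prices.priceOf(item), 0);
  }
}

class B implements PriceProvider {
  priceOf(item: string): number { return item.length; } // stand-in behaviour for the sketch
}

// Unit test: B is replaced by a test double, so only A's own logic is exercised.
test('A sums the prices it gets from its provider', () => {
  const stubB: PriceProvider = { priceOf: () => 2 };
  expect(new A(stubB).totalFor(['x', 'y'])).toBe(4);
});

// Integration test: the real A and the real B are wired together, to catch
// misunderstandings about how the interface between them is meant to work.
test('A and B agree on how prices are looked up', () => {
  expect(new A(new B()).totalFor(['ab', 'abc'])).toBe(5);
});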
This question is just about unit tests.
Recently I've been reading a lot about snapshots and I'm really confused as to when exactly I should use snapshot testing vs. just explicit assertions. I use React, Jest & Enzyme for unit testing.
As far as I understand it definitely makes sense to use snapshot testing:
to check if the component rendered the way we expected it to with the expected props. That way we don't really have to have an assertion for each prop or each component that was rendered, etc.
Questions:
1) But when it comes to user interactions like blur or click, there could be many cases. Does it make sense to have a snapshot for each of those test cases? Say I have 10 different cases that I want to test for onBlur; does it make sense to have 10 different snapshots for that? I know we can use serializers to filter out what we want to see in the snapshot, but isn't a regular data-driven test (which contains input & expected output provided by the developer) with a single assertion just better?
2) How about when I have a component which in turn renders a couple of child components, and those child components render their children, etc.? In that case I mount and then take the snapshot. That snapshot becomes really huge. Again, I know we can tweak it by use of serializers, but really, what's so great about snapshots in this case?
3) Isn't having too many snapshots in general a bad thing?
I also came across some fancy tools like jest-glamor-react etc. which can be used to get the most out of snapshot testing. But how do I figure out which scenario is best tested using snapshots and which is best tested using regular assertions? I read a bunch of articles; some people are really impressed with snapshots, but the examples are really basic, while others are totally against them and think plain old assertions are way better. Can someone please share their views?
Why snapshot testing?
No flakiness: Because tests are run in a command line runner instead of a real browser or on a real phone, the test runner doesn't have to wait for builds, spawn browsers, load a page and drive the UI to get a component into the expected state which tends to be flaky and the test results become noisy.
Fast iteration speed: Engineers want to get results in less than a second rather than waiting for minutes or even hours. If tests don't run quickly like in most end-to-end frameworks, engineers don't run them at all or don't bother writing them in the first place.
Debugging: It's easy to step into the code of an integration test in JS instead of trying to recreate the screenshot test scenario and debugging what happened in the visual diff.
This was taken from here.
Now to answer your question:
I would use snapshot testing when I have to keep track of UI elements, ensuring nothing changes without someone having intentionally made that change. Snapshots help you achieve this.
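For reference, a basic Jest snapshot test of a React component looks roughly like this; the Greeting component is an example, and react-test-renderer is assumed to be installed:

import React from 'react';
import renderer from 'react-test-renderer';
import { Greeting } from './Greeting'; // hypothetical component

test('Greeting renders as expected', () => {
  const tree = renderer.create(<Greeting name="Ada" />).toJSON();
  // The first run writes the snapshot file; later runs fail if the rendered output changes.
  expect(tree).toMatchSnapshot();
});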
I recently coded something awesome based on a JS library, and I want to remove every useless line that does nothing but make the file take longer to download from the server.
What is the best way to keep track of what piece of code was used/unused when my code executed? (So I can remove the unused code)
My first thought is to add something like log += "<br />Function name was used." at the start of every function and then remove every function that never showed up in the log. But I am curious whether there is another way.
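If you do go the manual route, a slightly cleaner variant of that idea is to record which functions ran in one place instead of concatenating strings onto the page; a rough sketch, with made-up function names:

// Collect the names of functions that actually ran.
const usedFunctions = new Set<string>();

function track(name: string): void {
  usedFunctions.add(name);
}

function someLibraryFunction(): void {
  track('someLibraryFunction'); // one line added at the top of each function under suspicion
  // ...original body...
}

// After exercising the app, dump the list and diff it against the full list of functions.
console.log([...usedFunctions]);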
I want to point out that modifying certain JS libraries might violate their licences and possibly cause serious legal issues. So if anyone is reading this and is planning to do the same thing as me, please read the licence(s) carefully before you even attempt it!
In my estimation, the best way to keep track of which code has actually executed would be to use a code coverage measurement tool. There are several available for JavaScript, many of which are outlined in a previous question: https://stackoverflow.com/questions/53249/are-there-any-good-javascript-code-coverage-tools .
Of course, this only tracks the code that has executed as a result of the test suite you are running it against, and would not be a foolproof way to find "completely dead" (i.e. unreachable) code. But it's a start...