Welcome to PhantomPilots.com

Anatomy of a DJI Flyaway

Discussion in 'General Discussion' started by ianwood, May 24, 2015.

  1. ianwood

    ianwood Taco Wrangler
    Staff Member

    Joined:
    Jan 7, 2014
    Messages:
    4,928
    Likes Received:
    1,800
    Location:
    Lost Angeles
    I had my first ever bona fide flyaway about 6 weeks ago. I had my theories, bought $1,000 in replacement parts, rebuilt and moved on. Then it happened again.



    This time, however, I had a data logger attached to the Phantom, a custom made miniature "black box" that records 75+ different flight parameters 5 times per second. I've spent the past day analyzing the data and I've discovered some things that I didn't suspect before.

    [IMG]

    None of the sensors including IMU, compass, GPS, etc. were to blame. The battery, ESCs, motors did exactly as they were supposed to do. There was no loss of connection, GPS interference or any other external factors.

    The NAZA locked up. It had intermittent freezes that stopped it from functioning normally. Here is a snapshot of the logged data from when the NAZA stopped working normally.

    [IMG: snapshot of the logged data]

    At 12 minutes, 45 seconds, the data shows 5 anomalies:
    • The Course Over Ground (COG) freezes. It had been fluid all throughout the earlier part of the flight.
    • The speed (Total SPD) changes to 769,361.92 m/s. It had been normal throughout the earlier part of the flight.
    • The satellite count (SAT) jumps to 204 then returns to normal.
    • Erratic updates of ATTI pitch and roll values. 2Hz or slower.
    • Erratic updates of motor speed. 2Hz or slower.
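    As a rough illustration of how anomalies like the first three could be flagged automatically in a log, here is a minimal Python sketch. The Sample type, the field names and every threshold are assumptions made for illustration only; none of them come from the actual data logger or from the NAZA.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    t: float      # seconds since start of flight (hypothetical field)
    cog: float    # course over ground, degrees
    speed: float  # total speed, m/s
    sats: int     # satellite count

# All limits below are assumed values, not figures from the original post.
MAX_SPEED = 50.0    # m/s; anything above this is treated as implausible
MAX_SATS = 32       # upper bound for a believable satellite count
FROZEN_COUNT = 5    # identical COG readings in a row counted as a freeze

def find_anomalies(samples):
    """Return a list of (time, label) pairs for suspicious samples."""
    anomalies = []
    run = 1  # length of the current run of identical COG values
    for i, s in enumerate(samples):
        if s.speed > MAX_SPEED:
            anomalies.append((s.t, "speed"))
        if not 0 <= s.sats <= MAX_SATS:
            anomalies.append((s.t, "sats"))
        if i > 0 and s.cog == samples[i - 1].cog:
            run += 1
            if run == FROZEN_COUNT:  # report each freeze only once
                anomalies.append((s.t, "cog_frozen"))
        else:
            run = 1
    return anomalies
```

    Fed a log sampled 5 times per second, this would flag a speed of 769,361.92 m/s or a satellite count of 204 immediately, and a frozen COG once it had stayed identical for a full second.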
    The cause is likely to be one of two things:
    • Defective hardware in NAZA (e.g. cold solder joint) that took several flights of light vibration to manifest.
    • Firmware defect that manifests only rarely, when flying in an edge condition (a corner of the envelope).
    I am working on recovering more of the video (file is corrupted due to loss of power).

    I've included a more detailed analysis of the data which is in the PDF link here: http://www.ianwood.com/docs/anatomy-of-a-dji-flyaway-v1.pdf
     
    #1 ianwood, May 24, 2015
    Last edited: May 26, 2015
  2. Jacob

    Jacob Administrator
    Staff Member

    Joined:
    Mar 3, 2015
    Messages:
    737
    Likes Received:
    305
    Wow, this is very impressive. I would for sure send this on to DJI so you can get a replacement.
     
    ianwood likes this.
  3. discv

    Joined:
    May 7, 2013
    Messages:
    619
    Likes Received:
    26
    Location:
    London, UK
    The most riveting post I've seen on this or any other UAV forum.

    Does this not prove what we suspected right back in the days of the early P1s? That the NAZA freezes.

    Ian, a muppet guide on how you gathered this data please.
     
    TMSmalley and ianwood like this.
  4. Marlin009

    Joined:
    Dec 31, 2014
    Messages:
    1,091
    Likes Received:
    515
    Location:
    West Coast of Florida
    Wow is right!

    Sorry to hear about a second crash but glad you were prepared.
    You were prepared, understatement of the year.
     
    ianwood likes this.
  5. Hughie

    Joined:
    Nov 22, 2014
    Messages:
    1,477
    Likes Received:
    141
    Location:
    Leeds, United Kingdom
    Great job Ian.

    Edge conditions are not going to be easy to investigate... what about examination of NAZA for cold joints etc. I remember seeing a report/youtube clip somewhere of some guys who flew multicopters commercially having a problem which they traced (I think !) to a cold joint on one of the NAZA connectors. I wonder if that is still about.
     
    ianwood likes this.
  6. dirkclod

    dirkclod Moderator
    Staff Member

    Joined:
    May 9, 2014
    Messages:
    10,669
    Likes Received:
    5,710
    Location:
    Amory Mississippi
    Was your second crash with the same bird you rebuilt Ian ?
     
  7. Trumple

    Joined:
    May 19, 2015
    Messages:
    279
    Likes Received:
    73
    Location:
    UK
    So this was an issue with the NAZA on the P2 then? It seems odd that this would also happen on other models through various updates and firmware changes. Certainly sounds like a firmware bug
     
  8. Hughie

    Joined:
    Nov 22, 2014
    Messages:
    1,477
    Likes Received:
    141
    Location:
    Leeds, United Kingdom
    Out of interest, why have you ruled out hardware? E.g. flawed manufacturing process/ cold joint aggravated by vibration and age. If this did happen twice in the same aircraft I would suspect hardware over firmware.
     
  9. Tyrone the drone

    Joined:
    Apr 26, 2015
    Messages:
    48
    Likes Received:
    7
    Looks like DJI has some explaining to do!
     
    lamiker likes this.
  10. Trumple

    Joined:
    May 19, 2015
    Messages:
    279
    Likes Received:
    73
    Location:
    UK
    Happening twice in the same aircraft doesn't necessarily rule out hardware or software, or indeed point to either as the cause. That said, the fact that one aircraft can show the problem after just a few flights whilst another can fly hundreds of flights successfully does actually suggest hardware.

    My reasoning for leaning more towards firmware is purely based on personal experience. I've just completed a 4-year degree in electronics engineering, and throughout my time these kinds of intermittent problems usually pointed to software/firmware, particularly for the following reasons:
    • It's the NAZA flight controller which is causing the issue, which is most likely just a number of microprocessors on a PCB. Microprocessors are held in place physically by solder joints, and these are highly unlikely to fail later if they worked the first time. In contrast, if it was a sensor that was causing the issue, this would more likely point to a hardware issue, e.g. wear and tear of mechanical parts or failure of analogue components leading to strange readings.
    • In my experience, physical failures of microprocessors are highly uncommon and usually just result in total failure of the module, rather than erratic behaviour.
    • Firmware/software can be vastly more complex than the hardware. Particularly in a control module such as the NAZA, there is not much to mechanically go wrong, while there are potentially thousands of unaccounted for issues in firmware. Testing the hardware would be a relatively simple task, whereas thoroughly testing the firmware might take hundreds of man-hours. Generally a "good enough" approach is taken, and while I'm sure DJI test their firmware well, bugs will always be present in any complex system such as this.
    • These things are incredibly complex, and the software of such a central device is less easily tested than the hardware. The hardware in the case of the NAZA likely revolves around the microprocessors. Those are often standard parts produced by someone like Texas Instruments, deployed worldwide in hundreds of different applications, and put through a much more rigorous testing phase than DJI's firmware.
    If you're unconvinced, take a look at what happened to NASA's Mars Pathfinder mission in 1997:
    http://research.microsoft.com/en-us/um/people/mbj/mars_pathfinder/mars_pathfinder.html
    The hardware worked well, but despite their enormous budget, even NASA missed some multi-threading bugs.
     
    ianwood likes this.
  11. 750r

    Joined:
    Nov 27, 2013
    Messages:
    1,226
    Likes Received:
    420
    Location:
    Pa, USA
    Is this the same NAZA from the first flyaway/crash?
    What software are you running?
    How much damage happened this time?
    $1000 to rebuild?
    Were you in the same place as the first crash?
    Sorry bud, you're having some bad luck :(
     
  12. Fyod

    Joined:
    May 21, 2014
    Messages:
    684
    Likes Received:
    61
    Location:
    Central EU
    What I'd like to know is if the non-Lite Naza has the same known problem.
    Because then a firmware re-flash to the non-Lite would solve this, though it is controversial and possibly "piracy".

    Also, I agree with Trumple. The PCBs and soldering inside the Phantom are pretty high quality. Even the PCB in the controller, where you wouldn't really expect it, has an ENIG finish. The solder joint failures we have seen pictures of are caused by the human factor: things that are not machine soldered, e.g. the motor to ESC wire. This is a connection a human solders.
    We have also seen IC failures on the ESC, but these are probably due to high heat and vibration.

    The Naza, although I've never taken it apart and only seen pictures, seems to be fully machine placed and soldered parts except the board-to-board and pin connectors. A failure could happen there, although I doubt it would cause the deviations posted above.
    Again, ENIG PCBs and even a clear lacquer coating ensure very good part-to-board connection. To break this connection, the part would have to suffer very high heat, enough to weaken the solder and, in combination with vibration, cause disconnection. If this happened, it would most likely not be able to fly again.

    And as mentioned by Trumple, it is virtually impossible to test firmware in all environments and conditions. This is probably why the Jhook problems took so long to solve. They never sent anyone to actually test the Jhook and did the firmware fix based on user data and the mathematics of GPS in relation to magnetic declination.
     
    #12 Fyod, May 24, 2015
    Last edited: May 24, 2015
    ianwood likes this.
  13. Hughie

    Joined:
    Nov 22, 2014
    Messages:
    1,477
    Likes Received:
    141
    Location:
    Leeds, United Kingdom
    Many of your points have some validity. But the fact that hundreds of thousands of aircraft exercise the same or very similar firmware for many hours without issue, and the fact that a good number of the aircraft behaving like this seem to have done over 100 flights, could be significant.
     
    Trumple likes this.
  14. Trumple

    Joined:
    May 19, 2015
    Messages:
    279
    Likes Received:
    73
    Location:
    UK
    Yep agreed. I think the fact that the issues seem to be limited to just a few aircraft indicates that it could be hardware related. If it was firmware, it should be more of an apparent problem. However, without any kind of statistics we can't really know one way or the other.
     
  15. Fyod

    Joined:
    May 21, 2014
    Messages:
    684
    Likes Received:
    61
    Location:
    Central EU
    Since only some of the GPS data is corrupt, my best guess would be a firmware failure in the GPS chip. This may not even be the Naza's fault, but the Naza should have some kind of failsafe for corrupt data, e.g. when speed from GPS goes over XY, stop and slowly land regardless of GPS data.
    The corrupt data above makes all GPS functions totally useless. Even RTH wouldn't be possible unless it ignored COG and speed and tried to return with some default values.
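    A failsafe along those lines might look something like the Python sketch below. It is only an illustration of the idea: the mode names, the limits and the three-strikes rule are hypothetical, and nothing here reflects how the NAZA firmware actually works.

```python
# Assumed plausibility limits; the post leaves the speed threshold as "XY".
MAX_PLAUSIBLE_SPEED = 50.0  # m/s
MAX_PLAUSIBLE_SATS = 32

def gps_fix_plausible(speed_mps, sat_count):
    """True if the reported GPS speed and satellite count look sane."""
    return (0.0 <= speed_mps <= MAX_PLAUSIBLE_SPEED
            and 0 <= sat_count <= MAX_PLAUSIBLE_SATS)

def choose_mode(speed_mps, sat_count, bad_streak, limit=3):
    """Pick a flight mode from the latest GPS reading.

    Returns (mode, new_bad_streak). Mode names are hypothetical:
    GPS_HOLD  - trust GPS as normal
    ATTI_HOLD - ignore GPS and hold attitude only
    LAND      - stop and descend slowly without using GPS data
    """
    if gps_fix_plausible(speed_mps, sat_count):
        return "GPS_HOLD", 0
    bad_streak += 1
    if bad_streak >= limit:
        return "LAND", bad_streak
    return "ATTI_HOLD", bad_streak
```

    Requiring several bad readings in a row before landing is a design choice in this sketch, so that a single glitched sample drops GPS hold for a moment rather than ending the flight.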
     
    JKDSensei likes this.
  16. ZonComGMZ

    Joined:
    Dec 8, 2013
    Messages:
    189
    Likes Received:
    1
    I had a similar incident yesterday. Have P2V+. Calibrated Compass and had Homelock & GPS lock. Did a couple of previous flights at same location with no problems. On the third flight I was hovering about 20 feet in the air with no wind and it was rock steady. All of a sudden the P2V+ took an extreme bank and it then tried to correct itself, wobbling all the way toward the ground. I was able to apply FULL Throttle Up to slow the descent so it just bumped the ground and flew back up. When it was back in the air I could not control it as normal. I had to give it full input in the opposite direction to the one it started to go in to keep it in the normal "Sticks Off" position. I landed it, changed my shorts and checked everything out and it seemed OK, except that the camera was now tilting due to the hard hop. I fixed that with no problems. I was in a park away from any sources of interference and I was unable to explain why the P2V+ changed back so abruptly without any input from the sticks. This is the second time this has happened and I am getting very wary of flying the P2V+ knowing that at any second it could just have a mind of its own.
     
  17. finlayson

    Joined:
    Jun 30, 2014
    Messages:
    91
    Likes Received:
    9
    FYI, I have software that will likely be able to repair your video file:
    http://live555.com/drones/DJI-video-fix/
     
  18. bartman

    Joined:
    May 25, 2015
    Messages:
    3
    Likes Received:
    0
    Ian

    were you flying around in GPS mode when this happened or stationary?

    kudos to you for having the skills to incorporate a data logger into your stock Phantom and for discovering this glitch. we discussed this a while ago at multirotorforums.com that the Phantom likely doesn't have the processing power to allow a user to fly around in a dynamic fashion while also engaged in GPS position hold mode. the data flow would be staggering as it attempts to calculate position hold info that is changing rapidly as the heli is flying around. even the APM 2.5 runs into a computational wall when an Okto is being asked to do too many things at the same time, hence the Pixhawk.

    we've concluded the same thing as a lot of other users, that the Phantom 3 is a great little helicopter for the money but the controller is not of the same caliber as other top shelf controllers (DJI A2, Mikrokopter FC 2.5, Pixhawk, etc.). Flying around while also having GPS position hold engaged just doesn't seem like a good idea and we've recommended against it.

    Great work!

    bart
    multirotorforums.com
     
    #18 bartman, May 25, 2015
    Last edited: May 25, 2015
  19. unclebob

    Joined:
    Jan 12, 2015
    Messages:
    42
    Likes Received:
    6
    You should get a full time salary from DJI. They would have paid someone to do this anyway.
     
    bartman likes this.
  20. Hughie

    Joined:
    Nov 22, 2014
    Messages:
    1,477
    Likes Received:
    141
    Location:
    Leeds, United Kingdom
    I have a feeling they may already know all about it :(
     
    GoodnNuff and dirkclod like this.